Many statistical hypotheses can be formulated in terms of polynomial
equalities and inequalities in the unknown parameters and
thus correspond to semi-algebraic subsets of the parameter space. 
We consider large sample asymptotics for the likelihood ratio test of 
such hypotheses in models that satisfy standard probabilistic regularity
conditions. We show that the assumptions of Chernoff's theorem
hold for semi-algebraic sets such that the asymptotics are determined 
by the tangent cone at the true parameter point. At boundary points 
or singularities, the tangent cone need not be a linear space and 
limiting distributions other than chi-square distributions may arise. 
While boundary points often lead to mixtures of chi-square distribu- 
tions, singularities give rise to nonstandard limits. We demonstrate 
that minima of chi-square random variables are important for locally 
identifiable models, and in a study of the factor analysis model with 
one factor, we reveal connections to eigenvalues of Wishart matrices. 

1. Introduction. Let $\mathcal{P}_\Theta = (P_\theta \mid \theta \in \Theta)$ be a parametric family of
probability distributions on some measurable space. Suppose that $\Theta$ is an
open subset of $\mathbb{R}^k$. For a hypothesis $\Theta_0 \subseteq \Theta$, consider testing

(1.1)    $H_0 : \theta \in \Theta_0$  vs.  $H_1 : \theta \in \Theta \setminus \Theta_0$

based on a large sample taken from a distribution in $\mathcal{P}_\Theta$. Under regularity
conditions, the null distribution of the likelihood ratio statistic for the
testing problem (1.1) can be approximated by the chi-square distribution
$\chi^2_c$ with degrees of freedom $c$ equal to the codimension of $\Theta_0$, that is,
$c = k - \dim(\Theta_0)$. The necessary regularity conditions combine
probabilistic conditions on $\mathcal{P}_\Theta$ with geometric smoothness assumptions
about $\Theta_0$. For example, the asymptotic approximation for the likelihood
ratio test is valid when $\mathcal{P}_\Theta$ is a regular exponential family and $\Theta_0$ a
smooth manifold, in which case the submodel
$\mathcal{P}_{\Theta_0} = (P_\theta \mid \theta \in \Theta_0)$ is called a curved exponential family [18].

In this paper we consider the situation where probabilistic regularity
conditions about $\mathcal{P}_\Theta$ hold but the null hypothesis $\Theta_0$ is a semi-algebraic
set, that is, a set defined by polynomial equalities and inequalities in
$\theta$. A semi-algebraic set has nice local geometric properties but it may
have boundary points as well as singularities at which $\chi^2$-asymptotics are
no longer valid. (For a rigorous definition of singularities of algebraic
sets see, e.g., [3], Section 3.2 or [7], Section 9.) The case of
semi-algebraic sets is important because many statistical hypotheses
exhibit this special structure [9, 13]. Moreover, tools from algebraic
geometry help in studying semi-algebraic sets and allow progress in the
understanding of the likelihood ratio test.

Boundary points of statistical hypotheses have been discussed in the
literature and often lead to asymptotic distributions that are mixtures of
$\chi^2$-distributions. Two classic examples where boundary issues arise are
variance component models [20] and factor analysis [25]; see also [24].
Singularities, however, do not seem to have received as much attention. For
example, the parameter spaces of factor analysis models, which we will take
up later in this paper, contain singularities at which the asymptotic
distribution of the likelihood ratio statistic is not a $\chi^2$-mixture.

Issues with singularities can be illustrated nicely for hypotheses about
the mean vector of a bivariate normal distribution $\mathcal{N}_2(\mu, I)$ with the
covariance matrix equal to the identity matrix $I$. For a closed set
$\Theta_0 \subseteq \Theta := \mathbb{R}^2$, the likelihood ratio statistic $\lambda_n$ for testing (1.1)
is equal to the product of the sample size $n$ and the squared Euclidean
distance between the sample mean vector and $\Theta_0$. The following two
examples demonstrate nonstandard asymptotics for $\lambda_n$; the connection to
tangent cones is based on a result of Chernoff [6] that we will revisit in
this paper.

Example 1.1 (Nodal cubic). Let
$\Theta_0 = \{\mu \in \mathbb{R}^2 \mid \mu_2^2 = \mu_1^3 + \mu_1^2\}$ be the curve on the left in
Figure 1, which can be parametrized as $f(t) = [t^2 - 1, t(t^2 - 1)]$. The
curve has a singularity at the point of self-intersection $\mu = 0$.




Fig. 1. Nodal and cuspidal cubic. 
The lines $\mu_2 = \pm\mu_1$ in the plot indicate the tangent cone at $\mu = 0$. If
$\mu = 0$ is the true parameter point and $n \to \infty$, then the likelihood ratio
statistic $\lambda_n$ converges to the distribution of the squared Euclidean
distance between a draw from $\mathcal{N}_2(0, I)$ and the lines $\mu_2 = \pm\mu_1$, that is,
the distribution of the minimum of two independent $\chi^2_1$-random variables.
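
The limit in Example 1.1 can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names are illustrative and do not come from the paper. It computes the squared Euclidean distance from a point to the union of the lines $\mu_2 = \pm\mu_1$ and compares a Monte Carlo tail probability with $P(\chi^2_1 > t)^2$, the tail of a minimum of two independent $\chi^2_1$-variables.

```python
import math
import random

def sq_dist_to_nodal_cone(z1, z2):
    """Squared Euclidean distance from (z1, z2) to the lines mu2 = mu1 and mu2 = -mu1."""
    d_plus = (z1 - z2) ** 2 / 2.0   # squared distance to the line mu2 = mu1
    d_minus = (z1 + z2) ** 2 / 2.0  # squared distance to the line mu2 = -mu1
    return min(d_plus, d_minus)

def chi2_1_tail(t):
    """P(chi^2_1 > t), written with the complementary error function."""
    return math.erfc(math.sqrt(t / 2.0))

def simulate_nodal_tail(t, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(limit > t) for a standard normal draw in R^2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if sq_dist_to_nodal_cone(z1, z2) > t:
            hits += 1
    return hits / n_draws

if __name__ == "__main__":
    t = 1.0
    print("simulated tail:", simulate_nodal_tail(t))
    print("closed form   :", chi2_1_tail(t) ** 2)
```

The two squared distances are $(Z_1 - Z_2)^2/2$ and $(Z_1 + Z_2)^2/2$, and the rotated coordinates $(Z_1 \mp Z_2)/\sqrt{2}$ are independent standard normals, which is why the closed form is a simple square.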

Example 1.2 (Cuspidal cubic). Let
$\Theta_0 = \{\mu \in \mathbb{R}^2 \mid \mu_2^2 = \mu_1^3\}$ be the curve with parametrization
$f(t) = (t^2, t^3)$ shown on the right in Figure 1. If the true parameter
point is the cusp $\mu = 0$, then the asymptotic distribution of $\lambda_n$ is the
mixture $\tfrac{1}{2}\chi^2_1 + \tfrac{1}{2}\chi^2_2$. This is the distribution of the squared
Euclidean distance between a draw from $\mathcal{N}_2(0, I)$ and the tangent cone
$\{\mu \mid \mu_1 \ge 0, \mu_2 = 0\}$.
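
The mixture in Example 1.2 can be illustrated in the same way. The sketch below (Python standard library; all names are illustrative assumptions) computes the squared distance to the half-ray $\{\mu_1 \ge 0, \mu_2 = 0\}$ and compares a simulated tail probability with the tail of the mixture $\tfrac{1}{2}\chi^2_1 + \tfrac{1}{2}\chi^2_2$.

```python
import math
import random

def sq_dist_to_half_ray(z1, z2):
    """Squared Euclidean distance from (z1, z2) to the half-ray {mu1 >= 0, mu2 = 0}."""
    if z1 >= 0.0:
        return z2 ** 2            # orthogonal projection lands on the ray
    return z1 ** 2 + z2 ** 2      # otherwise the closest point is the origin

def mixture_tail(t):
    """P(W > t) for W distributed as (1/2) chi^2_1 + (1/2) chi^2_2."""
    chi2_1 = math.erfc(math.sqrt(t / 2.0))  # P(chi^2_1 > t)
    chi2_2 = math.exp(-t / 2.0)             # P(chi^2_2 > t)
    return 0.5 * chi2_1 + 0.5 * chi2_2

def simulate_half_ray_tail(t, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(limit > t) for a standard normal draw in R^2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if sq_dist_to_half_ray(z1, z2) > t:
            hits += 1
    return hits / n_draws

if __name__ == "__main__":
    t = 2.0
    print("simulated tail:", simulate_half_ray_tail(t))
    print("mixture tail  :", mixture_tail(t))
```

The weights are 1/2 each because the sign of $Z_1$ is independent of $Z_1^2$ and $Z_2^2$ and is positive with probability 1/2.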

In the above examples, points other than the origin are smooth points
at which the curves locally look like a line. Thus, away from the origin
the standard $\chi^2$-asymptotics, here $\chi^2_1$, apply. However, while $\chi^2$-limits
arise almost everywhere, the convergence is not uniform and a very large
sample size may be required for the $\chi^2$-distribution to provide a good
approximation to the distribution of $\lambda_n$ if the true parameter is close to
the singular locus. An important point is also that limiting distributions
at singularities can be stochastically larger (Example 1.2) as well as
smaller (Example 1.1) than the $\chi^2_1$-distribution obtained at smooth points.

The remainder of this paper begins with a review of the asymptotic theory 
for the likelihood ratio test (Section 2). We then show that the geometric 
regularity conditions in this theory are satisfied for semi-algebraic hypothe- 
ses (Section 3). In Section 4 we discuss algebraic methods that are helpful 
for determining tangent cones of semi-algebraic sets and can be used in
particular to bound the asymptotic p-value of the likelihood ratio test. These
methods are applied to factor analysis in Sections 5 and 6. Concluding re- 
marks are given in Section 7. 

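2. Likelihood ratio tests and tangent cones. Suppose we observe a sample
of independent and identically distributed random vectors
$X^{(1)}, \dots, X^{(n)} \in \mathbb{R}^m$ and that the distribution of $X^{(i)}$ belongs to the
statistical model $\mathcal{P}_\Theta = (P_\theta \mid \theta \in \Theta)$. We assume that the
distributions in $\mathcal{P}_\Theta$ are dominated by a common $\sigma$-finite measure $\nu$ with
respect to which they have probability density functions
$p_\theta : \mathbb{R}^m \to [0, \infty)$. For sample realizations $x^{(1)}, \dots, x^{(n)}$ let

$\ell_n(\theta) = \sum_{i=1}^{n} \log p_\theta(x^{(i)})$

be the log-likelihood function of the model $\mathcal{P}_\Theta$. For $\Theta_0 \subseteq \Theta$, a
maximum likelihood estimator $\hat\theta_{n,\Theta_0}$ in the (sub-)model
$\mathcal{P}_{\Theta_0} = (P_\theta \mid \theta \in \Theta_0)$ satisfies

$\ell_n(\hat\theta_{n,\Theta_0}) = \max_{\theta \in \Theta_0} \ell_n(\theta).$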
The likelihood ratio statistic for testing the fit of $\mathcal{P}_{\Theta_0}$, that is, for
testing (1.1), is

(2.1)    $\lambda_n = 2\Bigl(\sup_{\theta \in \Theta} \ell_n(\theta) - \sup_{\theta \in \Theta_0} \ell_n(\theta)\Bigr).$

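For our study of large sample asymptotics for $\lambda_n$, we base ourselves on
van der Vaart [34], Chapter 16, and make the following probabilistic
regularity assumptions. Recall that the model $\mathcal{P}_\Theta$ is differentiable in
quadratic mean at $\theta \in \Theta$ if there exists a measurable map
$\dot\ell_\theta : \mathbb{R}^m \to \mathbb{R}^k$ such that

$\lim_{h \to 0} \frac{1}{\|h\|^2} \int_{\mathbb{R}^m} \Bigl( \sqrt{p_{\theta+h}(x)} - \sqrt{p_\theta(x)} - \tfrac{1}{2} h^t \dot\ell_\theta(x) \sqrt{p_\theta(x)} \Bigr)^2 \, d\nu(x) = 0.$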



Lemma 7.6 in [34] gives a simple sufficient condition for differentiability in 
quadratic mean. 

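Definition 2.1. A statistical model $\mathcal{P}_\Theta$ is regular at $\theta \in \Theta \subseteq \mathbb{R}^k$ if
the following conditions hold:

(i) the point $\theta$ is in the interior of $\Theta$, which is assumed to be nonempty;

(ii) the model $\mathcal{P}_\Theta$ is differentiable in quadratic mean at $\theta$ with an
invertible Fisher-information matrix $I(\theta) = E_\theta[\dot\ell_\theta(X)\dot\ell_\theta(X)^t]$;

(iii) there exists a neighborhood $U(\theta) \subseteq \Theta$ of $\theta$ and a measurable function
$\dot\ell : \mathbb{R}^m \to \mathbb{R}$, square-integrable as $\int_{\mathbb{R}^m} \dot\ell(x)^2 \, dP_\theta(x) < \infty$, such that

$|\log p_{\theta_1}(x) - \log p_{\theta_2}(x)| \le \dot\ell(x) \|\theta_1 - \theta_2\| \quad \forall\, \theta_1, \theta_2 \in U(\theta);$

(iv) the maximum likelihood estimator $\hat\theta_{n,\Theta}$ is consistent under $P_\theta$.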

Example 2.2. Let $\Theta = \mathbb{R}^{m \times m}_{\mathrm{pd}}$ be the cone of symmetric positive definite
$m \times m$-matrices. The centered multivariate normal distributions
$(\mathcal{N}_m(0, \Sigma) \mid \Sigma \in \Theta)$ form a regular exponential family (the natural
parameter space is an open set). Such a family is regular in the sense of
Definition 2.1 at all of its parameter points. In subsequent examples we
tacitly identify the space of symmetric $m \times m$-matrices, denoted
$\mathbb{R}^{m \times m}_{\mathrm{sym}}$, with $\mathbb{R}^{\binom{m+1}{2}}$ and index the vectors in the latter space by
ordered pairs $ij$ with $i \le j$. The inverse $I(\Sigma)^{-1}$ of the
Fisher-information for $\Sigma = (\sigma_{ij})$ is then the
$\binom{m+1}{2} \times \binom{m+1}{2}$-matrix with $(ij, k\ell)$-entry
$\sigma_{ik}\sigma_{j\ell} + \sigma_{i\ell}\sigma_{jk}$.

For well-behaved large sample asymptotics of the likelihood ratio statistic
$\lambda_n$ at a true parameter point $\theta_0$ in the null hypothesis $\Theta_0$, the
probabilistic assumptions made above need to be complemented with
assumptions about the local geometry of $\Theta_0$ at $\theta_0$. This local geometry
expresses itself in the tangent cone.
Definition 2.3. The tangent cone $T_\Theta(\theta)$ of the set $\Theta \subseteq \mathbb{R}^k$ at the point
$\theta \in \mathbb{R}^k$ is the set of vectors in $\mathbb{R}^k$ that are limits of sequences
$\alpha_n(\theta_n - \theta)$, where $\alpha_n$ are positive reals and $\theta_n \in \Theta$ converge to $\theta$.

The tangent cone $T_\Theta(\theta)$ is a closed set that is a cone in the sense that
if $\tau \in T_\Theta(\theta)$ then the half-ray $\{\lambda\tau \mid \lambda \ge 0\}$ is contained in $T_\Theta(\theta)$. We
state some properties of tangent cones that can be found for example in [23].

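Lemma 2.4. Let $\theta \in \mathbb{R}^k$ and $\Theta, \Theta_1, \dots, \Theta_m \subseteq \mathbb{R}^k$.

(i) $T_{\Theta_1 \cup \dots \cup \Theta_m}(\theta) = T_{\Theta_1}(\theta) \cup \dots \cup T_{\Theta_m}(\theta)$.

(ii) $T_{\Theta_1 \cap \dots \cap \Theta_m}(\theta) \subseteq T_{\Theta_1}(\theta) \cap \dots \cap T_{\Theta_m}(\theta)$.

(iii) If $\Theta - \theta$ is a cone, then $T_\Theta(\theta)$ is the topological closure of $\Theta - \theta$.

(iv) Let $\Theta = f(\Gamma)$ for some differentiable map $f : \mathbb{R}^d \to \mathbb{R}^k$ and some open
set $\Gamma \subseteq \mathbb{R}^d$. If $\theta = f(\gamma)$ for some $\gamma \in \Gamma$, then $T_\Theta(\theta)$ contains the linear
space spanned by the columns of the Jacobian

$\Bigl( \frac{\partial f_i(\gamma)}{\partial \gamma_j} \Bigr) \in \mathbb{R}^{k \times d}.$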

The next definition describes a regularity requirement on the local
geometry of a hypothesis $\Theta_0$ at a point $\theta_0$; see [23], Proposition 6.2
and [14].

Definition 2.5. The set $\Theta \subseteq \mathbb{R}^k$ is Chernoff-regular at $\theta$ if for every
vector $\tau$ in the tangent cone $T_\Theta(\theta)$ there exist $\varepsilon > 0$ and a map
$\alpha : [0, \varepsilon) \to \Theta$ with $\alpha(0) = \theta$ such that $\tau = \lim_{t \to 0+} [\alpha(t) - \alpha(0)]/t$.

Under Chernoff-regularity, likelihood ratio statistics converge to
Mahalanobis distances between a random draw from a multivariate normal
distribution and the tangent cone at the true parameter point $\theta_0$. This
result first appeared in [6]. The version given here is proven in [34],
Theorem 16.7. Note that under Chernoff-regularity the sets
$\sqrt{n}(\Theta_0 - \theta_0)$ converge to $T_{\Theta_0}(\theta_0)$ in the sense of [34]; compare [14].

Theorem 2.6. Let $\theta_0 \in \Theta_0 \subseteq \Theta \subseteq \mathbb{R}^k$ be a true parameter point at which
the model $\mathcal{P}_\Theta$ is regular with Fisher-information $I(\theta_0)$. Let
$Z \sim \mathcal{N}_k(0, I(\theta_0)^{-1})$. If $\Theta_0$ is Chernoff-regular at $\theta_0$ and the maximum
likelihood estimator $\hat\theta_{n,\Theta_0}$ is consistent, then as $n$ tends to infinity,
the likelihood ratio statistic $\lambda_n$ converges to the distribution of the
squared Mahalanobis distance

$\min_{\tau \in T_{\Theta_0}(\theta_0)} (Z - \tau)^t I(\theta_0) (Z - \tau).$



If $I(\theta_0) = I(\theta_0)^{t/2} I(\theta_0)^{1/2}$ and $Z \sim \mathcal{N}_k(0, I)$ is a standard normal
vector, then the squared Mahalanobis distance has the same distribution as
the squared Euclidean distance

$\min_{\tau \in T_{\Theta_0}(\theta_0)} \|Z - I(\theta_0)^{1/2}\tau\|^2$

between $Z$ and the linearly transformed tangent cone $I(\theta_0)^{1/2} T_{\Theta_0}(\theta_0)$.

We remark that changing the matrix square root $I(\theta_0)^{1/2}$ corresponds to
an orthogonal transformation under which Euclidean distances as well as
the standard normal distribution are invariant.

If the tangent cone $T_{\Theta_0}(\theta_0)$ in Theorem 2.6 is a $d$-dimensional linear
subspace of $\mathbb{R}^k$ then we recover the standard $\chi^2$-asymptotics because the
squared Euclidean distance between a $d$-dimensional subspace and a
standard normal vector follows a $\chi^2_{k-d}$-distribution. Another well-known
case arises if $T_{\Theta_0}(\theta_0)$ is a convex cone. In this case the limiting
distribution is a mixture of $\chi^2$-distributions with degrees of freedom
larger than or equal to the codimension of $T_{\Theta_0}(\theta_0)$; see
[19, 25, 26, 28, 29, 31] and Example 1.2. These mixtures are also known as
chi-bar-square distributions.

The next example gives another important type of nonstandard limiting 
distributions that we have already encountered in the artificial Example 1.1. 



Example 2.7. Suppose we wish to test whether the marginal independence
$X_1 \perp\!\!\!\perp X_2$ and the conditional independence $X_1 \perp\!\!\!\perp X_2 \mid X_3$ hold
simultaneously in $(X_1, X_2, X_3)^t \sim \mathcal{N}_3(0, \Sigma)$. This is the case if and only
if the unknown covariance matrix $\Sigma = (\sigma_{ij})$ satisfies $\sigma_{12} = 0$ and
$\sigma_{13}\sigma_{23} = 0$. Define the linear space
$L_i = \{\Sigma \in \mathbb{R}^{3 \times 3}_{\mathrm{sym}} \mid \sigma_{12} = \sigma_{i3} = 0\}$ for $i = 1, 2$. The null hypothesis
$\Theta_0 \subseteq \mathbb{R}^{3 \times 3}_{\mathrm{sym}}$ comprises the positive definite matrices in $L_1 \cup L_2$.
A true covariance matrix $\Sigma_0$ that is not diagonal belongs either to $L_1$
or to $L_2$ such that the tangent cone $T_{\Theta_0}(\Sigma_0)$ is equal to $L_1$ or $L_2$,
respectively. Since $\dim(L_i) = 4$, it follows that $\lambda_n$ converges to a
$\chi^2_{6-4} = \chi^2_2$-distribution. If, however, $\Sigma_0$ is diagonal then $\Theta_0 - \Sigma_0$
coincides with the closed cone $L_1 \cup L_2$ near the origin and, by
Lemma 2.4(iii), $T_{\Theta_0}(\Sigma_0) = L_1 \cup L_2$. The Fisher-information $I(\Sigma_0)$ and its
symmetric square-root $I(\Sigma_0)^{1/2}$ are now diagonal. Diagonal
transformations leave the cone $L_1 \cup L_2$ invariant such that by
Theorem 2.6, $\lambda_n$ converges in distribution to the minimum of
$Z_{12}^2 + Z_{13}^2$ and $Z_{12}^2 + Z_{23}^2$ for a standard normal random vector
$Z \in \mathbb{R}^{3 \times 3}_{\mathrm{sym}}$. This is the distribution of $W_{12} + \min(W_{13}, W_{23})$, where
$W_{12}$, $W_{13}$ and $W_{23}$ are independent $\chi^2_1$-random variables. We note that
this example can also be worked out by elementary means [8].
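
As a rough numerical illustration of the diagonal case in Example 2.7, the sketch below (Python standard library; the function names and the use of the mean as a check are assumptions made here, not part of the paper) simulates $W_{12} + \min(W_{13}, W_{23})$. The reference value $2 - 2/\pi$ for the mean is derived here from $E[\min(W_{13}, W_{23})] = 1 - 2/\pi$ for two independent $\chi^2_1$-variables; it lies strictly between the means 1 and 2 of the $\chi^2_1$- and $\chi^2_2$-distributions.

```python
import math
import random

def example_2_7_limit_draw(rng):
    """One draw from the limit W12 + min(W13, W23) at a diagonal covariance matrix."""
    z12 = rng.gauss(0.0, 1.0)
    z13 = rng.gauss(0.0, 1.0)
    z23 = rng.gauss(0.0, 1.0)
    dist_to_l1 = z12 ** 2 + z13 ** 2  # squared distance to L1 = {s12 = s13 = 0}
    dist_to_l2 = z12 ** 2 + z23 ** 2  # squared distance to L2 = {s12 = s23 = 0}
    return min(dist_to_l1, dist_to_l2)

def simulate_mean(n_draws=100_000, seed=0):
    """Monte Carlo estimate of the mean of the limiting distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += example_2_7_limit_draw(rng)
    return total / n_draws

# Mean of W12 + min(W13, W23): 1 + (1 - 2/pi), derived for this sketch.
EXPECTED_MEAN = 2.0 - 2.0 / math.pi

if __name__ == "__main__":
    print("simulated mean:", simulate_mean())
    print("reference mean:", EXPECTED_MEAN)
```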



Examples in which Chernoff-regularity fails and the likelihood ratio
statistic $\lambda_n$ does not converge in distribution can be found in [9] and [14].
Remark 2.8 (Comparing nested submodels). The above setup considers
problem (1.1) that compares the fit of the submodel $\mathcal{P}_{\Theta_0}$ to the fit of the
saturated model $\mathcal{P}_\Theta$. Instead, one may wish to compare to the fit of another
submodel $\mathcal{P}_{\Theta_1}$, that is, test $H_0 : \theta \in \Theta_0$ versus $H_1 : \theta \in \Theta_1 \setminus \Theta_0$, where
$\Theta_0 \subseteq \Theta_1 \subseteq \Theta$. However, if Theorem 2.6 applies to both (1.1) and the
problem $H_0 : \theta \in \Theta_1$ versus $H_1 : \theta \in \Theta \setminus \Theta_1$, then we can deduce that the
asymptotic distribution of the likelihood ratio statistic

$\lambda_n = 2\Bigl(\sup_{\theta \in \Theta_1} \ell_n(\theta) - \sup_{\theta \in \Theta_0} \ell_n(\theta)\Bigr)$

is given by the difference of the squared Mahalanobis distances between the
random vector $Z \sim \mathcal{N}_k(0, I(\theta_0)^{-1})$ and the two tangent cones, namely

$\lambda_n \to_d \min_{\theta \in T_{\Theta_0}(\theta_0)} (Z - \theta)^t I(\theta_0) (Z - \theta) - \min_{\theta \in T_{\Theta_1}(\theta_0)} (Z - \theta)^t I(\theta_0) (Z - \theta) \quad (n \to \infty).$

Remark 2.9 (Maximum likelihood estimators). Under the conditions
of Theorem 2.6, the maximum likelihood estimator $\hat\theta_{n,\Theta_0}$ can be shown to
be distributed like the projection of $Z \sim \mathcal{N}_k(0, I(\theta_0)^{-1})$ on the tangent
cone $T_{\Theta_0}(\theta_0)$ [34], Theorem 7.12. Here, projection refers to the minimizer
of the Mahalanobis distance $(Z - \theta)^t I(\theta_0)(Z - \theta)$ over $\theta \in T_{\Theta_0}(\theta_0)$. This
minimizer is almost surely unique [14], Proposition 4.2. For results on
local maximizers of the likelihood function, see [27].

3. Semi-algebraic hypotheses. In principle, Chernoff regularity has to be
verified in every application of Theorem 2.6. However, as we detail in this
section, Chernoff regularity is automatic if the hypothesis $\Theta_0$ is given by
a semi-algebraic set. The map $\alpha$ in Definition 2.5 can be chosen very smooth
for semi-algebraic sets.



3.1. Semi-algebraic sets. We begin by briefly reviewing some of the
properties of semi-algebraic sets. In-depth treatments can be found in [2, 3, 5].



Definition 3.1. Let $\mathbb{R}[t] = \mathbb{R}[t_1, \dots, t_k]$ be the ring of polynomials in
the indeterminates $t_1, \dots, t_k$ with real coefficients. A basic semi-algebraic
set is a subset of $\mathbb{R}^k$ that is of the form

$\Theta = \{\theta \in \mathbb{R}^k \mid f(\theta) = 0 \ \forall f \in F, \ h(\theta) > 0 \ \forall h \in H\},$

where $F, H \subseteq \mathbb{R}[t]$ are (possibly empty) collections of polynomials and $H$ is
finite. A semi-algebraic set is a finite union of basic semi-algebraic sets.
If $H = \emptyset$ then $\Theta$ is called a real algebraic variety.
Complements, finite unions and finite intersections of semi-algebraic sets
are again semi-algebraic. If $\Gamma \subseteq \mathbb{R}^d$ is semi-algebraic and $f : \mathbb{R}^d \to \mathbb{R}^k$ a
polynomial map (or a rational map defined everywhere on $\Gamma$), then the image
$f(\Gamma)$ is a semi-algebraic set. The parameter spaces of many statistical
models are such images; compare [22].

A semi-algebraic set $\Theta$ can be written as a disjoint union of finitely many
smooth manifolds $\Theta_1, \dots, \Theta_s$ such that if, for $i \ne j$, $\Theta_i$ and the closure
$\mathrm{cl}(\Theta_j)$ have a nonempty intersection then $\Theta_i \subseteq \mathrm{cl}(\Theta_j)$ and
$\dim(\Theta_i) < \dim(\Theta_j)$. Such a partition is known as a stratification of $\Theta$.
The dimension of $\Theta$ can be defined as the largest dimension of any manifold
in the stratification. If $\Theta = f(\Gamma)$ is the image of an open semi-algebraic
set $\Gamma$ under a polynomial map $f$, then $\dim(\Theta)$ is equal to the maximal rank
of the Jacobian of $f$ at $\gamma$ over all $\gamma \in \Gamma$. This maximal rank is achieved at
almost every $\gamma$.

Definition 3.2. For a point $\theta$ in the semi-algebraic set $\Theta$, define $d_m$ to
be the dimension of the semi-algebraic set $\Theta \cap B_{1/m}(\theta)$, where $B_{1/m}(\theta)$ is
the open ball of radius $1/m$ around $\theta$. The sequence $(d_m)_{m \in \mathbb{N}}$ being
nonincreasing, there exists $m_0$ such that $d_m = d_{m_0}$ for all $m \ge m_0$. The
local dimension of $\theta$ is defined to be $\dim_\theta(\Theta) := d_{m_0}$, which is no larger
than $\dim(\Theta)$.

If there exists a ball $B_r(\theta)$ such that the semi-algebraic set $\Theta \cap B_r(\theta)$
is a $d$-dimensional smooth manifold then $\theta$ is a smooth point and its local
dimension is $\dim_\theta(\Theta) = d$. Almost all points $\theta$ of a semi-algebraic set $\Theta$
are smooth of local dimension $\dim_\theta(\Theta) = \dim(\Theta)$. In other words, the set of
points $\theta \in \Theta$ with $\dim_\theta(\Theta) < \dim(\Theta)$ is a subset of dimension smaller than
$\dim(\Theta)$.

A semi-algebraic set, even a real algebraic variety, may have smooth points
of different local dimensions. For example, the so-called Whitney umbrella
defined by $x^2 z = y^2$ in $\mathbb{R}^3$ has two-dimensional smooth points, which arise
if $x \ne 0$ or $y \ne 0$. The points with $x = y = 0$ and $z > 0$ are not smooth, but
the points $x = y = 0$ and $z < 0$ lie on a line and are thus one-dimensional
smooth points. However, if $\Theta = f(\Gamma)$ for an open semi-algebraic set $\Gamma$ and a
polynomial map $f$, then $\Theta$ is pure-dimensional in the sense that
$\dim_\theta(\Theta) = \dim(\Theta)$ for all $\theta \in \Theta$ [11].

3.2. Tangent cones of semi-algebraic sets. If $\Theta$ is semi-algebraic, then
the tangent cone $T_\Theta(\theta)$ at a point $\theta \in \Theta$ is also a semi-algebraic set. The
dimension of the tangent cone $T_\Theta(\theta)$ is at most $\dim(\Theta)$ and may be strictly
smaller even for polynomial images of open semi-algebraic sets. For example,
if $f : (s, t) \mapsto (s^2 + t^2, s^3 + t^3)$ then $f(\mathbb{R}^2)$ is the two-dimensional set that
includes all points that are on or to the right of the cuspidal cubic shown
in Figure 1. At the origin, this two-dimensional set has the one-dimensional
dashed half-ray as tangent cone.
Despite this possible difference between the dimension of the tangent cone
and the dimension of the set itself, the tangent cones to semi-algebraic sets 
are very well-behaved: the vectors in the tangent cone are tangent to very 
smooth curves in the semi-algebraic set. This has the following important 
consequence that ensures the existence of limiting distributions in many 
examples. 

Lemma 3.3. A semi-algebraic set $\Theta \subseteq \mathbb{R}^k$ is everywhere Chernoff-regular.

Proof. Proposition 2 of [21] shows that if $\Theta$ is a real algebraic variety
and $\tau \in T_\Theta(\theta)$ for some $\theta \in \Theta$, then there exists a real analytic curve
$\alpha : [0, \varepsilon) \to \Theta$ with $\alpha(0) = \theta$ that for $t \in [0, \varepsilon)$ has a convergent Taylor series
expansion as $\alpha(t) = \theta + \tau t^r + O(t^{r+1})$ with $r \ge 1$. The result for
semi-algebraic sets can be proven in exactly the same fashion by altering
Claim 2 in [21] to make a requirement about inequalities $h_j(\theta) > 0$. By
Lemma 2.4(i) it suffices to consider a basic semi-algebraic set. □

Remark 3.4. Let $(P_\theta \mid \theta \in \Theta)$ be a regular exponential family. Drton
and Sullivant [9] define a submodel $(P_\theta \mid \theta \in \Theta_0)$ to be an algebraic
exponential family if $\Theta_0 \subseteq \Theta$ is diffeomorphic to a semi-algebraic set. By
Lemma 3.3, $\Theta_0$ is everywhere Chernoff-regular such that Theorem 2.6 applies
at every point $\theta_0 \in \Theta_0$ at which the maximum likelihood estimator
$\hat\theta_{n,\Theta_0}$ is consistent. According to [4], Theorem 3.1, Corollary 3.3,
$\hat\theta_{n,\Theta_0}$ is consistent if $\Theta_0$ is locally compact at $\theta_0$. A semi-algebraic set
need not be locally compact. However, the likelihood ratio statistic $\lambda_n$
does not change if in (2.1) we replace $\Theta_0$ by the union of $\Theta_0$ and the
closure of $B_\varepsilon(\theta_0) \cap \Theta_0$. This closure is meaningful for small $\varepsilon$ because $\theta_0$
is in the interior of $\Theta$. With this change $\Theta_0$ is locally compact at $\theta_0$, and
we can deduce that the first-order asymptotics of the likelihood ratio test
for testing the goodness-of-fit of an algebraic exponential family are
always given by Mahalanobis distances from the tangent cone.

3.3. Locally identifiable models. Suppose $\mathcal{P}_\Theta = (P_\theta \mid \theta \in \Theta)$ is an
identifiable model, that is, $P_\theta = P_{\bar\theta}$ implies $\theta = \bar\theta$. Let $\Gamma \subseteq \mathbb{R}^d$ be an open
semi-algebraic set and $f : \Gamma \to \mathbb{R}^k$ a polynomial or rational map such that
$\Theta_0 = f(\Gamma) \subseteq \Theta$. The submodel $\mathcal{P}_{\Theta_0}$ that is parameterized by $f$ is globally
identifiable at $\gamma_0 \in \Gamma$ if $\gamma_0$ is the unique point in $\Gamma$ that is mapped to
$\theta_0 = f(\gamma_0)$. The submodel $\mathcal{P}_{\Theta_0}$ is locally identifiable at $\gamma_0 \in \Gamma$ if there
exists a neighborhood $U(\gamma_0) \subseteq \Gamma$ of $\gamma_0$ such that $f^{-1}(\theta_0) \cap U(\gamma_0) = \{\gamma_0\}$.
Local identifiability often arises as assumed in the following proposition,
where $J(\gamma)$ denotes the Jacobian of $f$ at $\gamma$.
Proposition 3.5. Let $\theta_0$ be a true parameter point at which $\mathcal{P}_\Theta$ is
regular and the maximum likelihood estimator $\hat\theta_{n,\Theta_0}$ consistent. Suppose
$f^{-1}(\theta_0)$ is a finite set such that the Jacobian $J(\gamma)$ has full rank at all
$\gamma \in f^{-1}(\theta_0)$. If $f$ is proper at $\theta_0$, that is, there exists a compact
neighborhood $V \subseteq \mathbb{R}^k$ of $\theta_0$ such that $f^{-1}(V) \cap \Gamma$ is compact in $\mathbb{R}^d$, then the
likelihood ratio statistic $\lambda_n$ for (1.1) converges to the distribution of a
minimum of at most $|f^{-1}(\theta_0)|$ random variables with
$\chi^2_{k - \dim(\Theta_0)}$-distribution.

Proof. By the full rank assumption, there exist neighborhoods $U(\gamma) \subseteq \Gamma$
of $\gamma \in f^{-1}(\theta_0)$ such that $M_\gamma = f(U(\gamma))$ are smooth manifolds. Consider a
sequence $(\theta_n) = (f(\gamma_n))$ in $\Theta_0$ that converges to $\theta_0$. Since $\theta_n \in V$ for all
large $n$, the sequence $(\gamma_n)$ is eventually contained in the compactum
$f^{-1}(V) \cap \Gamma$. Since $f$ is continuous, all accumulation points of $(\gamma_n)$ are in
the finite preimage $f^{-1}(\theta_0)$. Therefore,

$\gamma_n \in \bigcup_{\gamma \in f^{-1}(\theta_0)} U(\gamma)$

for all $n$ larger than some $n_0 \in \mathbb{N}$. It follows that locally at $\theta_0$, the set
$\Theta_0$ is equal to a finite union of the smooth manifolds $M_\gamma$, $\gamma \in f^{-1}(\theta_0)$.
According to Lemma 2.4(i), the tangent cone $T_{\Theta_0}(\theta_0)$ is the finite union of
the tangent spaces of the manifolds $M_\gamma$, which are the linear spaces $L_\gamma$
spanned by the columns of $J(\gamma)$.

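Let $Z \sim \mathcal{N}_k(0, I(\theta_0)^{-1})$, as in Theorem 2.6. The limiting distribution of
$\lambda_n$ is the distribution of

$\min_{\gamma \in f^{-1}(\theta_0)} \Bigl( \min_{\tau \in L_\gamma} (Z - \tau)^t I(\theta_0) (Z - \tau) \Bigr).$

Since the $L_\gamma$ are linear spaces of dimension $\dim(\Theta_0)$, the displayed
expression is a minimum of $\chi^2_{k - \dim(\Theta_0)}$-random variables. If $L_\gamma = L_{\bar\gamma}$ for
$\gamma \ne \bar\gamma \in f^{-1}(\theta_0)$, then only one of the two associated $\chi^2$-variables needs
to be included in the minimum. □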

Proposition 3.5 makes no statement about the dependence of the
$\chi^2$-random variables in the minimum. In the artificial Example 1.1, the two
$\chi^2_1$-random variables were independent but, as illustrated next, this is not
the case in general.

Example 3.6. Suppose $\varepsilon_1, \dots, \varepsilon_4$ are independent normal random variables
distributed as $\varepsilon_i \sim \mathcal{N}(0, \omega_i)$ with $\omega_i > 0$. Consider the system of linear
equations

$Y_1 = \varepsilon_1, \qquad Y_2 = \beta_{21} Y_1 + \beta_{24} Y_4 + \varepsilon_2,$

$Y_3 = \beta_{31} Y_1 + \beta_{32} Y_2 + \varepsilon_3, \qquad Y_4 = \beta_{43} Y_3 + \varepsilon_4,$

which has a feedback loop among $Y_2$, $Y_3$ and $Y_4$; compare the graphical
representation in Figure 2. In matrix form, the equations state that if
$Y = (Y_1, \dots, Y_4)^t$ and $\varepsilon = (\varepsilon_1, \dots, \varepsilon_4)^t$ then $BY = \varepsilon$ for

$B = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -\beta_{21} & 1 & 0 & -\beta_{24} \\ -\beta_{31} & -\beta_{32} & 1 & 0 \\ 0 & 0 & -\beta_{43} & 1 \end{pmatrix}.$

Fig. 2. Graph with feedback loop.

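Let

$D = \{(\beta_{21}, \dots, \beta_{43})^t \in \mathbb{R}^5 \mid \det(B) = 1 - \beta_{32}\beta_{24}\beta_{43} \ne 0\}$

and $\Gamma = D \times (0, \infty)^4$. The map from $(\beta, \omega) \in \Gamma$ to the covariance matrix of $Y$
is rational; denote it by $f$. The set $\Theta_0 = f(\Gamma)$ is a semi-algebraic subset of
the cone of positive definite matrices $\Theta = \mathbb{R}^{4 \times 4}_{\mathrm{pd}}$. It is the parameter
space of the Gaussian model $\mathcal{P}_{\Theta_0} = (\mathcal{N}_4(0, \Sigma) \mid \Sigma \in \Theta_0)$ that is induced by the
equation system.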

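The Jacobian of $f$ is of rank at least 8 for all $(\beta, \omega) \in \Gamma$. It is of full
rank 9 unless

(3.2)    $\beta_{31} + \beta_{32}\beta_{21} = 0$ and $\beta_{32}\beta_{43}\beta_{24} = -1$.

Details of calculations that yield this and other facts employed here are
given in Appendix A.1. By Lemma A.1, the model $\mathcal{P}_{\Theta_0}$ is globally identifiable
at $(\beta, \omega)$ unless

(3.3)    $\beta_{31} + \beta_{32}\beta_{21} = 0$, $\beta_{32}\beta_{43}\beta_{24} \ne -1$ and $\beta_{32}, \beta_{43}, \beta_{24} \ne 0$.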

If $(\beta, \omega)$ satisfies (3.3), then $\mathcal{P}_{\Theta_0}$ is locally identifiable with the
preimage of $\Sigma = f(\beta, \omega)$ always being of cardinality two. Moreover, by
Lemma A.2, the map $f$ is proper at $\Sigma$. Hence, in this locally identifiable
case, the likelihood ratio statistic $\lambda_n$ converges to the distribution of
the minimum of two $\chi^2_1$-random variables.

Suppose the true parameter point is $\Sigma_0 = f(\beta, \omega)$ with $(\beta, \omega)$ as in (3.3).
Using Lemma 2.4(iv), it can be shown that the tangent cone $T_{\Theta_0}(\Sigma_0)$ is equal
to the union of two hyperplanes whose normal vectors $\eta$ and $\bar\eta$ have zero
components except at their 13- and 14-entries. The nontrivial entries are
(3.4)    $\eta_{(13,14)} = (\beta_{43}, -1)^t$

and

(3 - 5) ??(13 ' M) " I 1 '" u i + ft 3 u 3 + Pi 2 (3lu 2 ) ■ 

Equation (3.4) is readily obtained by computing the kernel of the transposed
Jacobian of $f$ at $(\beta, \omega)$. Equation (3.5) follows from replacing $\beta_{43}$ by the
component $\bar\beta_{43}$ of the second vector $(\bar\beta, \bar\omega)$ with $f(\bar\beta, \bar\omega) = \Sigma_0$; compare
(A.2) where $\kappa_i = 1/\omega_i$.

In order to describe the limiting distribution of the likelihood ratio
statistic more precisely, we need to consider the transformed tangent cone
$I(\Sigma_0)^{1/2} T_{\Theta_0}(\Sigma_0)$, which is the union of the two hyperplanes with normal
vectors $I(\Sigma_0)^{-t/2}\eta$ and $I(\Sigma_0)^{-t/2}\bar\eta$. The cosine of the angle between these
two normal vectors is equal to

$\frac{\eta^t I(\Sigma_0)^{-1} \bar\eta}{\sqrt{\eta^t I(\Sigma_0)^{-1} \eta} \, \sqrt{\bar\eta^t I(\Sigma_0)^{-1} \bar\eta}}.$

This expression simplifies to

$\rho = \frac{\beta_{43}\omega_3 + \beta_{43}\beta_{32}^2\omega_2 - \beta_{24}\beta_{32}\omega_4}{\sqrt{(\omega_3 + \beta_{32}^2\omega_2 + \beta_{24}^2\beta_{32}^2\omega_4)(\omega_4 + \beta_{43}^2\omega_3 + \beta_{32}^2\beta_{43}^2\omega_2)}};$

recall the formula for the inverse Fisher-information matrix $I(\Sigma_0)^{-1}$ from
Example 2.2. We may thus conclude that $\lambda_n$ converges to the distribution of
the squared Euclidean distance between a standard normal point
$Z \sim \mathcal{N}_2(0, I)$ in $\mathbb{R}^2$ and two lines through the origin that intersect at angle
$\cos^{-1}(\rho)$.
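
To make the role of the angle concrete, the following sketch (Python standard library; the function names and the particular choice of unit normals are assumptions made for illustration) computes the squared Euclidean distance from a point in the plane to two lines through the origin whose unit normals have inner product `rho`. With a standard normal point, the two signed distances are standard normal with correlation `rho`, so the limit is a minimum of two dependent $\chi^2_1$-variables unless `rho = 0`.

```python
import math
import random

def two_line_sq_dist(z1, z2, rho):
    """Squared distance from (z1, z2) to two lines through the origin.

    The lines have unit normals n1 = (1, 0) and n2 = (rho, sqrt(1 - rho^2)),
    so they intersect at angle arccos(rho).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    u = z1                                          # signed distance to line 1
    v = rho * z1 + math.sqrt(1.0 - rho * rho) * z2  # signed distance to line 2
    return min(u * u, v * v)

def simulate_two_line_tail(t, rho, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(limit > t) for a standard normal point in R^2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if two_line_sq_dist(z1, z2, rho) > t:
            hits += 1
    return hits / n_draws

if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, simulate_two_line_tail(1.0, rho))
```

At `rho = 0` the sketch reproduces the independent case of Example 1.1, and as `rho` approaches 1 the two lines merge and the limit approaches a single $\chi^2_1$-variable.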

If $\beta_{31} + \beta_{32}\beta_{21} = 0$ and at least one of the parameters $\beta_{32}$, $\beta_{43}$, $\beta_{24}$ is
zero, then $\mathcal{P}_{\Theta_0}$ is globally identifiable at $(\beta, \omega)$ and the Jacobian of $f$ is
of full rank, but Proposition 3.5 does not apply, as explained at the end of
Appendix A.1. It is interesting that in this case, the limiting distribution
of $\lambda_n$ is not a $\chi^2$-distribution, which is shown in Proposition A.4 in
Appendix A.2. This fact as well as results on the remaining cases are
obtained using the algebraic techniques we present in the next section.

4. Algebraic tangent cones and bounds on p-values. In this section, we 
explain how algebraic tools can help find smooth and non-smooth points of 
a semi-algebraic hypothesis. Algebraic methods also allow one to compute 
(asymptotic) bounds on p-values for the likelihood ratio test. 

4.1. Boundary points and singularities. For a semi-algebraic set $\Theta \subseteq \mathbb{R}^k$,
define $\mathcal{I}(\Theta)$ to be the ideal of polynomials $f \in \mathbb{R}[t]$ that vanish whenever
evaluated at a point $\theta \in \Theta$. By Hilbert's basis theorem, the ideal $\mathcal{I}(\Theta)$ has
a finite generating set $\{f_1, \dots, f_\ell\} \subseteq \mathbb{R}[t]$. In other words, there exist
finitely many polynomials $f_1, \dots, f_\ell$ such that $f \in \mathcal{I}(\Theta)$ if and only if
$f = h_1 f_1 + \dots + h_\ell f_\ell$ for some $h_i \in \mathbb{R}[t]$. The real algebraic variety
defined by the vanishing of all polynomials in $\mathcal{I}(\Theta)$, or equivalently the
polynomials $f_1, \dots, f_\ell$, is the Zariski closure $\bar\Theta$ of $\Theta$. The Zariski closure
is the smallest real algebraic variety containing $\Theta$ and in particular
$\dim(\bar\Theta) = \dim(\Theta)$.

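Definition 4.1. Let $\Theta \subseteq \mathbb{R}^k$ be a semi-algebraic set with Zariski closure
$\bar\Theta$. A subset of $\Theta$ is open in $\bar\Theta$ if it is equal to the intersection of an
open set in $\mathbb{R}^k$ and $\bar\Theta$. The interior $\mathrm{int}(\Theta)$ is the union of all subsets of
$\Theta$ that are open in $\bar\Theta$. The boundary $\mathrm{bd}(\Theta)$ is the complement
$\Theta \setminus \mathrm{int}(\Theta)$.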

At a boundary point $\theta \in \mathrm{bd}(\Theta)$ the tangent cone need not be a linear space
such that nonstandard limiting distributions may arise for the likelihood
ratio statistic. However, this phenomenon may also occur at singularities.
We recall the definition of singularities as given, for example, in [3],
Section 3.2.

A real algebraic variety $\Theta$ is irreducible if it cannot be written as the
union of two strict subsets that are also real algebraic varieties. Any real
algebraic variety $\Theta$ can be written as a finite union of irreducible varieties,

(4.1)    $\Theta = \Theta_1 \cup \dots \cup \Theta_m.$

If no two varieties $\Theta_i$ and $\Theta_j$ in (4.1) are ordered by inclusion, then the
decomposition in (4.1) is unique up to reordering and the irreducible
varieties $\Theta_i$ are called the irreducible components of $\Theta$. Let
$\{f_1, \dots, f_\ell\} \subseteq \mathbb{R}[t]$ generate the ideal $\mathcal{I}(\Theta)$, and let $J(\theta) \in \mathbb{R}^{\ell \times k}$ be the
Jacobian with $ij$th entry $\partial f_i(\theta)/\partial\theta_j$. Let $r(\Theta)$ be the maximum rank of any
matrix $J(\theta)$, $\theta \in \Theta$. Then $r(\Theta)$ is independent of the choice of the generating
set $\{f_1, \dots, f_\ell\}$ and it holds that $r(\Theta) = k - \dim(\Theta)$.

Definition 4.2. Let $\theta$ be a point in the real algebraic variety $\Theta \subseteq \mathbb{R}^k$.

(i) If $\Theta$ is irreducible and the rank of $J(\theta)$ is smaller than $r(\Theta)$ then $\theta$
is a singular point of $\Theta$.

(ii) If $\Theta_1, \dots, \Theta_m$ are the irreducible components of $\Theta$ then $\theta$ is a
singular point of $\Theta$ if it is a singular point of some $\Theta_i$ or if it is in an
intersection $\Theta_i \cap \Theta_j$ of two distinct components.

If $\Theta$ is a semi-algebraic set then $\theta \in \Theta$ is a singular point of $\Theta$ if it is a
singular point of the Zariski closure $\bar\Theta$.

The software Singular [15] provides routines for computing all
singularities of $\Theta$ from a generating set $\{f_1, \dots, f_\ell\} \subseteq \mathbb{R}[t]$ for $\mathcal{I}(\Theta)$.
A nonsingular interior point $\theta$ of a semi-algebraic set $\Theta$ is also a smooth
point with tangent cone $T_\Theta(\theta)$ that is equal to the linear space given by the
kernel of the Jacobian matrix $J(\theta)$. This fact translates into the following
statistical result.

Theorem 4.3. Let $\theta_0 \in \Theta_0 \subseteq \Theta \subseteq \mathbb{R}^k$ be a true parameter point at which
the model $\mathcal{P}_\Theta$ is regular and the maximum likelihood estimator $\hat\theta_{n,\Theta_0}$
consistent. If $\Theta_0$ is semi-algebraic and $\theta_0$ a nonsingular interior point of
$\Theta_0$, then the likelihood ratio statistic $\lambda_n$ converges to the
$\chi^2_c$-distribution with $c = k - \dim_{\theta_0}(\Theta_0)$ degrees of freedom as $n \to \infty$.

The following is a useful condition for checking the assumption of Theo- 
rem 4.3; we use it in Proposition A.4(i) and Theorem 5.1. 

Lemma 4.4. Let $\Theta_0 = f(\Gamma)$, where $\Gamma \subseteq \mathbb{R}^d$ is an open semi-algebraic set
and $f$ a polynomial or rational map. If $\theta_0 = f(\gamma_0) \in \Theta_0$ is nonsingular and
the Jacobian of $f$ at $\gamma_0 \in \Gamma$ of full rank, then $\theta_0$ is in the interior of $\Theta_0$.

Proof. The Jacobian being of full rank, there exists a neighborhood
$U$ of $\gamma_0$ such that $f(U)$ is a $d$-dimensional smooth manifold. Since $\theta_0$ is
nonsingular, there exists a neighborhood $V$ of $\theta_0$ such that the intersection
of $V$ and the Zariski closure $\bar\Theta_0$ is also a $d$-dimensional smooth manifold.
Since $f(U) \subseteq \bar\Theta_0$, these two manifolds are nested by inclusion. Hence, due to
their equal dimension, they must coincide locally. Therefore, in a
neighborhood of $\theta_0$, the three sets $f(U) \subseteq \Theta_0 \subseteq \bar\Theta_0$ are equal. It follows
that $\theta_0 \in \mathrm{int}(\Theta_0)$. □

Nonsingularity is not necessary for $\chi^2$-asymptotics. For example, suppose
$\Theta_0 \subseteq \mathbb{R}^2$ is the union of the two parabolas $y = \pm x^2$ given by the equation
$y^2 = x^4$. The origin is a singularity of $\Theta_0$ with tangent cone equal to the
$x$-axis. Hence, $\lambda_n \to_d \chi^2_1$ at every point in $\Theta_0$. Removing the part of the
parabola passing through the positive orthant, we make the origin a singular
boundary point at which $\lambda_n \to_d \chi^2_1$.
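The singularity at the origin in this example can be checked directly from the gradient of the defining polynomial. The following is a minimal sketch, not part of the original development; the function names are ours, and the gradient criterion is applied to the single generator $y^2 - x^4$, which we assume generates the ideal of the union of the two parabolas.

```python
def f(x, y):
    """Defining polynomial of the union of the parabolas y = x^2 and y = -x^2."""
    return y**2 - x**4


def grad_f(x, y):
    """Gradient (df/dx, df/dy) of f, differentiated by hand."""
    return (-4 * x**3, 2 * y)


def is_singular(x, y):
    """A point of the curve is flagged as singular when the gradient vanishes."""
    if f(x, y) != 0:
        raise ValueError("point does not lie on the curve")
    return grad_f(x, y) == (0, 0)


if __name__ == "__main__":
    print(is_singular(0, 0))    # the origin, where the two parabolas meet
    print(is_singular(1, 1))    # a point on the upper parabola
    print(is_singular(2, -4))   # a point on the lower parabola
```

Only the origin is flagged, in agreement with Definition 4.2(ii), since it is the single point shared by the two irreducible components.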

4.2. Algebraic tangent cones and bounds on p-values. For complicated
statistical hypotheses it may be difficult to work out the tangent cone, in
which case it is interesting to find sub- and supersets of the tangent cone.
The Mahalanobis distances from these sub-/supersets provide distributions
that are stochastically larger/smaller than the limiting distribution of the
likelihood ratio statistic $\lambda_n$ and thus can be used to bound the asymptotic
p-value

$p_\infty(t) = \lim_{n\to\infty} P(\lambda_n > t), \quad t > 0.$

If a parametrization is available, then the following upper bound is immediate
from Lemma 2.4(iv).

Lemma 4.5. Suppose $\theta_0 \in \Theta_0 \subseteq \Theta \subseteq \mathbb{R}^k$ is a true parameter point at
which the model $\mathcal{P}_\Theta$ is regular and the maximum likelihood estimator $\hat\theta_{n,\Theta_0}$
consistent. Let $\Theta_0 = f(\Gamma)$ be the image of an open semi-algebraic set $\Gamma \subseteq \mathbb{R}^d$
under a polynomial map $f : \mathbb{R}^d \to \mathbb{R}^k$. Let $J(\gamma)$ be the Jacobian of $f$ at $\gamma$.
If $m$ is the maximum rank of any Jacobian $J(\gamma)$ with $\gamma \in f^{-1}(\theta_0)$, then
$p_\infty(t) \le P(\chi^2_{k-m} > t)$ for all $t > 0$.

For a lower bound based on a $\chi^2$-distribution one could employ the so-called
Zariski tangent space given by the kernel of the Jacobian matrix $J(\theta_0)$
that, as in Definition 4.2, is derived from a generating set $\{f_1, \dots, f_\ell\} \subset \mathbb{R}[t]$
of the ideal $\mathcal{I}(\Theta_0)$. However, if the true parameter $\theta_0$ is a singularity of $\Theta_0$
then the Zariski tangent space is of larger dimension than $\Theta_0$ and does not
provide a good local approximation to $\Theta_0$. For instance, the Zariski tangent
space at the cusp singularity in Example 1.2 comprises all of $\mathbb{R}^2$ and thus the
lower bound is trivially zero because it is computed from a $\chi^2_0$-distribution,
which is a point mass at zero.

A better local approximation to $\Theta_0$ is obtained from the algebraic tangent
cone defined in (4.3) below. The algebraic tangent cone is sometimes easier
to compute than the tangent cone. Gröbner basis methods to automate the
computation [7], Section 9.7, are implemented, for example, in Singular
[15]. We note that in Example 1.2, the algebraic tangent cone at the cusp
is equal to the $x$-axis and thus provides a lower $\chi^2_1$-bound for $p_\infty(t)$. A
similar phenomenon arises in the feedback model from Example 3.6; see
Proposition A.4(iii) in the Appendix.

Let $\theta$ be a point in a semi-algebraic set $\Theta \subseteq \mathbb{R}^k$. For a polynomial $f \in \mathcal{I}(\Theta) \subset \mathbb{R}[t]$
define $f_\theta$ to be the polynomial obtained from $f$ by substituting
$t_i + \theta_i$ for each indeterminate $t_i$ appearing in $f$. Write

(4.2) $f_\theta = \sum_{h=0}^{\deg f} f_{\theta,h}$

with $f_{\theta,h}$ being a homogeneous polynomial of degree $h$. Define $f_{\theta,\min}$ to be
the term $f_{\theta,h}$ that is of smallest degree among all nonzero terms in (4.2).
The algebraic tangent cone is the real algebraic variety

(4.3) $A_\Theta(\theta) = \{\tau \in \mathbb{R}^k \mid f_{\theta,\min}(\tau) = 0 \ \ \forall f \in \mathcal{I}(\Theta)\}.$

According to the following fact the Mahalanobis distance from the algebraic
tangent cone yields a lower bound for $p_\infty(t)$.

Lemma 4.6. Let $\theta$ be a point in the semi-algebraic set $\Theta$. Then $T_\Theta(\theta) \subseteq A_\Theta(\theta)$,
that is, the algebraic tangent cone contains the tangent cone.

The algebraic tangent cone $A_\Theta(\theta)$ is a subset of the Zariski tangent space
and has dimension equal to the largest dimension of any irreducible component
of the Zariski closure $\bar\Theta$ that contains $\theta$. For a polynomial image of an
open semi-algebraic set, $\bar\Theta$ is irreducible and the dimension of $A_\Theta(\theta)$ equal
to $\dim(\Theta)$. However, the algebraic tangent cone can be of larger dimension
than the tangent cone.
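The term $f_{\theta,\min}$ from (4.2) can be extracted mechanically for a single polynomial. The sketch below is an illustration under our own assumptions: polynomials are stored as dictionaries from exponent tuples to coefficients, the helper names `shift` and `min_degree_part` are hypothetical, and the cubic $y^2 - x^3$ serves as a stand-in cusp (we do not claim it is the curve of Example 1.2). For a hypersurface one generator suffices; in general the full ideal must be considered, which is where the Gröbner basis methods cited above come in.

```python
from itertools import product
from math import comb


def shift(poly, theta):
    """Return f(t + theta) for f given as {exponent tuple: coefficient}."""
    out = {}
    for exps, coeff in poly.items():
        # Expand prod_i (t_i + theta_i)^(e_i) with the binomial theorem.
        for ks in product(*(range(e + 1) for e in exps)):
            c = coeff
            for e, k, th in zip(exps, ks, theta):
                c *= comb(e, k) * th ** (e - k)
            if c != 0:
                out[ks] = out.get(ks, 0) + c
    # Drop monomials whose coefficients cancelled.
    return {mono: c for mono, c in out.items() if c != 0}


def min_degree_part(poly):
    """Homogeneous part of lowest total degree among the nonzero terms."""
    d = min(sum(mono) for mono in poly)
    return {mono: c for mono, c in poly.items() if sum(mono) == d}


# Stand-in cusp y^2 - x^3 in the variables (x, y).
cusp = {(0, 2): 1, (3, 0): -1}

if __name__ == "__main__":
    # At the origin the lowest-degree term is y^2, so the cone is the x-axis.
    print(min_degree_part(shift(cusp, (0, 0))))   # {(0, 2): 1}
    # At the smooth point (1, 1) it is the linear form 2y - 3x.
    print(min_degree_part(shift(cusp, (1, 1))))
```

At a smooth point the lowest-degree term is linear and recovers the tangent line; at the cusp it is the square $y^2$, whose zero set is a line even though the Zariski tangent space is the whole plane.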

The bounds on $p_\infty(t)$ that we discussed in this section are derived from
sub- and supersets of the tangent cone. It is noteworthy that if the tangent
cone is convex and the limiting distribution a $\chi^2$-mixture, then such bounds
can be improved using properties of the mixture weights; compare page 80
in [29]. However, the tangent cones at singularities are generally not convex
as can be seen in the example of the feedback model as well as in the factor
analysis model that we will study in the remainder of this paper. When
studying factor analysis we will employ the following lemma.

Lemma 4.7. For $f \in \mathbb{R}[t]$, define $\Theta$ to be the semi-algebraic set of points
$t \in \mathbb{R}^k$ that satisfy $f(t) \ge 0$. If $\theta \in \Theta$ satisfies $f(\theta) = 0$, then the tangent cone
$T_\Theta(\theta)$ is contained in the set $\{\tau \in \mathbb{R}^k \mid f_{\theta,\min}(\tau) \ge 0\}$.

Proof. Without loss of generality assume that $\theta = 0$ such that $f_\theta = f$.
Let $\tau \in T_\Theta(0)$ be the limit of the sequence $(\alpha_n\theta_n)$ with $\alpha_n > 0$ and $\theta_n \in \Theta$
converging to $\theta = 0$. Let $f_{\min} = f_{\theta,\min}$ be of degree $d$. Expanding $f$ as in (4.2)
we see that the nonnegative numbers $\alpha_n^d f(\theta_n)$ are equal to $f_{\min}(\alpha_n\theta_n)$ plus a
term that converges to zero as $n \to \infty$. Thus, $f_{\min}(\tau) = \lim_{n\to\infty} f_{\min}(\alpha_n\theta_n) \ge 0$.
□

5. Local geometry of the one-factor analysis model. Assuming zero means
to avoid notational overhead, the factor analysis model with $m$ observed variables
and $\ell$ hidden factors is the family of multivariate normal distributions
$\mathcal{N}_m(0, \Sigma)$ with covariance matrix $\Sigma$ in the set

(5.1) $F_{m,\ell} = \{\Delta + \Gamma\Gamma^t \mid \Delta \in \mathbb{R}^{m\times m} \text{ diagonal and positive definite}, \ \Gamma \in \mathbb{R}^{m\times\ell}\}.$

The set $F_{m,\ell}$ is a semi-algebraic subset of $\mathbb{R}^{\binom{m+1}{2}}$ with dimension
equal to the minimum of $m(\ell+1) - \binom{\ell}{2}$ and $\binom{m+1}{2}$; see, for example, [10].
This set has singularities and to our knowledge there have been no attempts
in the literature to clarify the role these singularities play for asymptotic
distribution theory. (Aspects of nonsingular boundary points created by allowing
the matrix $\Delta$ in (5.1) to be positive semi-definite have been discussed
in [25] and we will not treat these so-called Heywood-cases here.) In this section
we derive the tangent cones in factor analysis with $\ell = 1$ factor. The
distributional implications are discussed in Section 6.

In the remainder of this section we assume that $m \ge 4$ such that the set
$F_{m,1}$ is of dimension $2m < \binom{m+1}{2}$. We begin by describing the ideal $\mathcal{I}(F_{m,1})$.
Let $\mathbb{R}[t]$ be the ring of polynomials in the indeterminates $(t_{ij} \mid 1 \le i \le j \le m)$.
Define

$\mathcal{T}_m = \{t_{ij}t_{gh} - t_{ig}t_{jh},\ t_{ih}t_{jg} - t_{ig}t_{jh} \mid 1 \le i < j < g < h \le m\} \subset \mathbb{R}[t].$

The $2\binom{m}{4}$ quadrics in $\mathcal{T}_m$ are referred to as tetrads in the statistical literature.
According to Theorem 16 in [10], the set $\mathcal{T}_m$ generates the ideal $\mathcal{I}(F_{m,1})$.
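As a small sanity check, one can enumerate the tetrads and confirm that they vanish on a matrix of the form $\Delta + \Gamma\Gamma^t$. The following sketch is an illustration only; the function names and the test matrix are our own choices, and the check says nothing about the tetrads generating the full ideal.

```python
from itertools import combinations


def tetrads(m):
    """Index patterns of the tetrads for 1 <= i < j < g < h <= m.

    Each entry ((a, b), (c, d), (e, f), (p, q)) encodes the quadric
    t_ab * t_cd - t_ef * t_pq.
    """
    out = []
    for i, j, g, h in combinations(range(1, m + 1), 4):
        out.append(((i, j), (g, h), (i, g), (j, h)))
        out.append(((i, h), (j, g), (i, g), (j, h)))
    return out


def one_factor(delta, gamma):
    """Covariance matrix diag(delta) + gamma gamma^t as a nested list."""
    m = len(gamma)
    return [[gamma[a] * gamma[b] + (delta[a] if a == b else 0)
             for b in range(m)] for a in range(m)]


def evaluate(tetrad, sigma):
    """Value of one tetrad at the symmetric matrix sigma (1-based indices)."""
    (a, b), (c, d), (e, f), (p, q) = tetrad
    return (sigma[a - 1][b - 1] * sigma[c - 1][d - 1]
            - sigma[e - 1][f - 1] * sigma[p - 1][q - 1])


if __name__ == "__main__":
    m = 5
    sigma = one_factor([1, 1, 1, 1, 1], [2, -1, 3, 5, 7])
    print(len(tetrads(m)))                                  # 10 = 2 * C(5, 4)
    print(all(evaluate(t, sigma) == 0 for t in tetrads(m)))  # True
```

Integer inputs keep the arithmetic exact, so the vanishing is checked without rounding error.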

Theorem 5.1. Let $\Sigma = (\sigma_{ij}) \in F_{m,1}$ be the covariance matrix of a distribution
in the one-factor analysis model with $m \ge 4$. If there exist at least
two nonzero off-diagonal entries $\sigma_{ij}$ and $\sigma_{uv}$ with $i < j$ and $u < v$, then $\Sigma$
is a nonsingular and smooth point of $F_{m,1}$. If at most one off-diagonal entry
$\sigma_{ij}$ with $i < j$ is nonzero, then $\Sigma$ is a singular point of $F_{m,1}$. All points
$\Sigma \in F_{m,1}$ are of local dimension $\dim_\Sigma(F_{m,1}) = \dim(F_{m,1}) = 2m$.

Proof. The claim about local dimension holds because $F_{m,1}$ is the image
of a polynomial map; compare Section 3.1. Proposition 32 in [10] states
that a matrix $\Sigma = (\sigma_{ij}) \in F_{m,1}$ is a singularity if and only if at most one
off-diagonal entry $\sigma_{ij}$ with $i < j$ is nonzero. Let $f : (0,\infty)^m \times \mathbb{R}^m \to F_{m,1}$ be
the parametrization map that sends $(\delta, \Gamma)$ to the matrix $\Delta + \Gamma\Gamma^t \in F_{m,1}$,
where $\Delta$ is the diagonal matrix $\mathrm{diag}(\delta)$. In order to show that a nonsingular
point $\Sigma = f(\delta, \Gamma)$ is a smooth point, we check that the $\binom{m+1}{2} \times 2m$-Jacobian
of $f$ at $(\delta, \Gamma)$ is of full rank $2m$; recall Lemma 4.4.

If $\Sigma = f(\delta, \Gamma)$ has entries $\sigma_{ij} \ne 0$ and $\sigma_{uv} \ne 0$ for two distinct pairs $(i,j)$
and $(u,v)$ with $i < j$ and $u < v$, then $\Gamma = (\gamma_i) \in \mathbb{R}^m$ must have at least three
nonzero entries. Without loss of generality, assume that $\gamma_1, \gamma_2, \gamma_3 \ne 0$. Partition
the Jacobian matrix of $f$ by partitioning the columns according to the
split between $\delta$ and $\gamma$, and by partitioning the rows into the diagonal and
the off-diagonal entries of $\Sigma$. Since $\partial\sigma_{ij}/\partial\delta_k = 0$ if $i < j$, the Jacobian matrix
of $f$ is block-triangular. One of the diagonal blocks, namely the submatrix
filled with the partial derivatives $\partial\sigma_{ii}/\partial\delta_j$, is the $m \times m$-identity matrix.
Hence, the Jacobian is of full rank $2m$ if and only if the matrix of partial
derivatives $\partial\sigma_{ij}/\partial\gamma_k$, $i < j$, is of rank $m$. To see that this rank is indeed
$m$, form the $m \times m$-submatrix of partial derivatives $\partial\sigma_{ij}/\partial\gamma_k$ with
$(i,j) \in \{(1,2), \dots, (1,m), (2,3)\}$. This submatrix has determinant equal to
$2\gamma_1^{m-2}\gamma_2\gamma_3$ in absolute value. Since $\gamma_1, \gamma_2, \gamma_3 \ne 0$, the determinant is nonzero.
□
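The rank statement in this proof is easy to probe numerically. Below is a minimal sketch, assuming the parametrization $(\delta, \Gamma) \mapsto \mathrm{diag}(\delta) + \Gamma\Gamma^t$ from the proof; the helper names `jacobian` and `rank` are ours, and exact rational arithmetic is used so that rank decisions do not depend on a floating-point tolerance.

```python
from fractions import Fraction


def jacobian(gamma):
    """Jacobian of (delta, gamma) -> diag(delta) + gamma gamma^t.

    Rows are indexed by the entries sigma_ij with i <= j, columns by
    delta_1..delta_m followed by gamma_1..gamma_m. The Jacobian does not
    depend on delta.
    """
    m = len(gamma)
    rows = []
    for i in range(m):
        for j in range(i, m):
            row = [Fraction(0)] * (2 * m)
            if i == j:
                row[i] = Fraction(1)                  # d sigma_ii / d delta_i
                row[m + i] = Fraction(2 * gamma[i])   # d sigma_ii / d gamma_i
            else:
                row[m + i] = Fraction(gamma[j])       # d sigma_ij / d gamma_i
                row[m + j] = Fraction(gamma[i])       # d sigma_ij / d gamma_j
            rows.append(row)
    return rows


def rank(matrix):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    a = [row[:] for row in matrix]
    r = 0
    for c in range(len(a[0])):
        pivot = next((k for k in range(r, len(a)) if a[k][c] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for k in range(len(a)):
            if k != r and a[k][c] != 0:
                factor = a[k][c] / a[r][c]
                a[k] = [x - factor * y for x, y in zip(a[k], a[r])]
        r += 1
    return r


if __name__ == "__main__":
    print(rank(jacobian([1, 1, 1, 0])))   # 8: three nonzero loadings, full rank 2m
    print(rank(jacobian([1, 1, 0, 0])))   # 7: only sigma_12 is nonzero
```

A rank drop of the parametrization does not by itself prove that the image point is singular; in the theorem that part rests on Proposition 32 in [10].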

If $\Sigma \in F_{m,1}$ is a nonsingular point, then the tangent cone $T_{F_{m,1}}(\Sigma)$ is a
linear space. At the singularities of $F_{m,1}$ two types of tangent cones arise,
which we derive in Lemmas 5.2 and 5.6.

Lemma 5.2. If $\Sigma \in F_{m,1}$ is a diagonal matrix, then the tangent cone
$T_{F_{m,1}}(\Sigma)$ is equal to the (topological) closure of the set

$T_{m,1} = \{\Delta + \Gamma\Gamma^t \mid \Delta \in \mathbb{R}^{m\times m} \text{ diagonal}, \ \Gamma \in \mathbb{R}^m\}.$

Proof. In a neighborhood of the origin, the set $F_{m,1} - \Sigma$ is equal to
$T_{m,1}$. The set $T_{m,1}$ is a cone such that the claim follows from Lemma 2.4(iii).
□

Remark 5.3. The cone $T_{m,1}$ is not closed. For example, let

$\Gamma_n = (1/n, \ n, \ 1/n, \ 0, \dots, 0)^t \in \mathbb{R}^m$

and define $\Delta_n$ to be the diagonal matrix with $i$th diagonal entry equal to
the negated square of the $i$th entry of $\Gamma_n$. Then the sequence $(\Delta_n + \Gamma_n\Gamma_n^t)$
converges to a matrix that is zero except for the (1,2)- and (2,3)-entries
that are equal to 1. This limiting matrix is not in $T_{m,1}$.

Remark 5.4. (i) The result in Lemma 5.2 also carries over to the case of
$\ell > 1$ factors. (ii) The algebraic tangent cone $A_{F_{m,1}}(\Sigma)$ at a diagonal matrix
$\Sigma$ is simply the real algebraic variety defined by the tetrads $\mathcal{T}_m$.

Lemma 5.5. Let $\Sigma \in F_{m,1}$ have exactly one nonzero off-diagonal entry
$\sigma_{ij}$ with $i < j$. Then the algebraic tangent cone $A_{F_{m,1}}(\Sigma)$ is equal to the
set of matrices $S \in \mathbb{R}^{m\times m}_{\mathrm{sym}}$ that satisfy the following two conditions: (i) the
$([m] \setminus \{i,j\}) \times ([m] \setminus \{i,j\})$-principal submatrix of $S$ is diagonal, and (ii)
the rank of the $\{i,j\} \times ([m] \setminus \{i,j\})$-submatrix of $S$ is at most one. Here,
$[m] := \{1, \dots, m\}$.

Proof. The set $F_{m,1}$ is the image of the set $(0,\infty)^m \times \mathbb{R}^m$ under a
polynomial map. Thus the dimension of $A_{F_{m,1}}(\Sigma)$ is equal to $\dim(F_{m,1}) = 2m$.
Without loss of generality we assume that $i = 1$ and $j = 2$.

Let $T = (t_{gh})$ be the symmetric matrix of indeterminates. All $2 \times 2$-minors
of the $\{1,2\} \times \{3, \dots, m\}$-submatrix of $T$ are in the ideal $\mathcal{I}(F_{m,1})$. Since
none of these minors involve the indeterminate $t_{12}$, the $\{1,2\} \times \{3, \dots, m\}$-submatrix
of a (symmetric) matrix $S \in A_{F_{m,1}}(\Sigma)$ must have rank at most
one.

Let $3 \le g < h \le m$. Then the quadric

(5.2) $t_{12}t_{gh} - t_{1g}t_{2h} = \sigma_{12}t_{gh} + (t_{12} - \sigma_{12})t_{gh} - t_{1g}t_{2h}$

is a tetrad in $\mathcal{T}_m$. After substituting $t_{12} + \sigma_{12}$ for $t_{12}$, the lowest degree term
on the right-hand side in (5.2) is $\sigma_{12}t_{gh}$. Since $\sigma_{12} \ne 0$, a matrix $S = (s_{ij}) \in A_{F_{m,1}}(\Sigma)$
must have the (off-diagonal) entry $s_{gh} = 0$.

Let $C$ be the set of symmetric matrices for which the $\{1,2\} \times \{3, \dots, m\}$-submatrix
has rank at most one and all off-diagonal entries in the $\{3, \dots, m\} \times \{3, \dots, m\}$-submatrix
are zero. We have shown that $A_{F_{m,1}}(\Sigma) \subseteq C$. Matrices
in $C$ have the $m$ diagonal entries as well as the (1,2)-entry unconstrained.
Since the set of $2 \times (m-2)$-matrices of rank one has dimension
$(m-2) + 1 = m - 1$, the dimension of $C$ is equal to $2m = \dim(A_{F_{m,1}}(\Sigma))$.
The fact that $C$ is an irreducible algebraic variety in $\mathbb{R}^{m\times m}_{\mathrm{sym}}$ now implies
$A_{F_{m,1}}(\Sigma) = C$, which is the claim. □

The algebraic tangent cone does not depend on the value of the nonzero
off-diagonal entry $\sigma_{ij}$. Unfortunately, this is no longer true for the tangent
cone $T_{F_{m,1}}(\Sigma)$.

Lemma 5.6. Let $\Sigma \in F_{m,1}$ have exactly one nonzero off-diagonal entry
$\sigma_{ij}$ with $i < j$. If $\sigma_{ij} > 0$, then the tangent cone $T_{F_{m,1}}(\Sigma)$ is the set of matrices
$S = (s_{gh}) \in A_{F_{m,1}}(\Sigma)$ such that $s_{jg} = \eta \cdot s_{ig}$ for all $g \notin \{i,j\}$ and some
$\eta \in [\sigma_{ij}/\sigma_{ii}, \ \sigma_{jj}/\sigma_{ij}]$. If $\sigma_{ij} < 0$, then the analogue holds with negative multiplier
$\eta \in [\sigma_{jj}/\sigma_{ij}, \ \sigma_{ij}/\sigma_{ii}]$.

Proof. Without loss of generality, we assume that $i = 1$ and $j = 2$. Let
$\sigma_{12} > 0$ (the case $\sigma_{12} < 0$ is analogous). Denote the set of symmetric matrices
claimed to form the tangent cone by $\bar T_{F_{m,1}}(\Sigma)$.

We will first show that $T_{F_{m,1}}(\Sigma) \subseteq \bar T_{F_{m,1}}(\Sigma)$. We do not change the tangent
cone $T_{F_{m,1}}(\Sigma)$ if we restrict $F_{m,1}$ to a neighborhood of $\Sigma$. Hence, we can
replace $F_{m,1}$ by $F^{\varepsilon}_{m,1} = F^{\varepsilon}_{m,1}(\Sigma)$, which we define to be a neighborhood of $\Sigma$
in $F_{m,1}$ such that $\psi_{12} > 0$ for all $\Psi = (\psi_{ij}) \in F^{\varepsilon}_{m,1}$. Consider an index $g \ge 3$
and let $\Psi = (\psi_{ij}) = \Delta + \Gamma\Gamma^t \in F^{\varepsilon}_{m,1}$. If $\Gamma = (\gamma_1, \dots, \gamma_m)^t$, then
$\psi_{12}\psi_{1g}\psi_{2g} = \gamma_1^2\gamma_2^2\gamma_g^2 \ge 0$. It follows that $\psi_{1g}\psi_{2g} \ge 0$ on $F^{\varepsilon}_{m,1}$. Consequently,
$F^{\varepsilon}_{m,1} = F^{\varepsilon,+}_{m,1} \cup F^{\varepsilon,-}_{m,1}$ with $F^{\varepsilon,+}_{m,1}$ and $F^{\varepsilon,-}_{m,1}$ comprising the matrices
$\Psi = (\psi_{ij}) \in F^{\varepsilon}_{m,1}$ for which $\psi_{1g}, \psi_{2g} \ge 0$ and $\psi_{1g}, \psi_{2g} \le 0$, respectively.
According to Lemma 2.4(i), the tangent cone of $F_{m,1}$ at $\Sigma$ is the union of the
two tangent cones of $F^{\varepsilon,+}_{m,1}$ and $F^{\varepsilon,-}_{m,1}$.

Let $\Psi = (\psi_{ij}) = \Delta + \Gamma\Gamma^t \in F^{\varepsilon,+}_{m,1}$ with $\Gamma = (\gamma_1, \dots, \gamma_m)^t$. Then

(5.3) $\psi_{11}\psi_{2g} = (\delta_1 + \gamma_1^2)\gamma_2\gamma_g \ge \gamma_1^2\gamma_2\gamma_g = \psi_{12}\psi_{1g}.$

Similarly,

(5.4) $\psi_{22}\psi_{1g} \ge \psi_{12}\psi_{2g}.$

Let $S = (s_{ij}) \in T_{F^{\varepsilon,+}_{m,1}}(\Sigma)$. By Lemma 4.7, (5.3) and (5.4) it holds that
$s_{1g}, s_{2g} \ge 0$, $\sigma_{11}s_{2g} \ge \sigma_{12}s_{1g}$ and $\sigma_{22}s_{1g} \ge \sigma_{12}s_{2g}$. Thus,

(5.5) either $s_{1g} = s_{2g} = 0$ or $\left( s_{1g} > 0 \ \wedge \ \frac{s_{2g}}{s_{1g}} \in \left[\frac{\sigma_{12}}{\sigma_{11}}, \frac{\sigma_{22}}{\sigma_{12}}\right] \right)$,

which implies that $s_{2g} = \eta \cdot s_{1g}$ for some $\eta$ as in the claim. Similar consideration
of matrices in $F^{\varepsilon,-}_{m,1}$ yields that if $S = (s_{gh}) \in T_{F^{\varepsilon,-}_{m,1}}(\Sigma)$ then $s_{1g}, s_{2g} \le 0$,
$\sigma_{11}s_{2g} \le \sigma_{12}s_{1g}$ and $\sigma_{22}s_{1g} \le \sigma_{12}s_{2g}$. This implies an analogue to (5.5),
namely,

(5.6) either $s_{1g} = s_{2g} = 0$ or $\left( s_{1g} < 0 \ \wedge \ \frac{s_{2g}}{s_{1g}} \in \left[\frac{\sigma_{12}}{\sigma_{11}}, \frac{\sigma_{22}}{\sigma_{12}}\right] \right)$,

which in turn also implies that $s_{2g} = \eta \cdot s_{1g}$ for some $\eta$ as in the claim. Since
the considered index $g \ge 3$ was arbitrary and $T_{F_{m,1}}(\Sigma) \subseteq A_{F_{m,1}}(\Sigma)$, we have
proved the inclusion $T_{F_{m,1}}(\Sigma) \subseteq \bar T_{F_{m,1}}(\Sigma)$.

In order to show the reverse inclusion, that is, $\bar T_{F_{m,1}}(\Sigma) \subseteq T_{F_{m,1}}(\Sigma)$, we
write $\Sigma = \Delta_0 + \Gamma_0\Gamma_0^t$, where $\Delta_0 = \mathrm{diag}(\delta_0)$ is a diagonal and positive definite
matrix and $\Gamma_0 = (\gamma_{01}, \gamma_{02}, 0, \dots, 0)^t \in \mathbb{R}^m$. The pairs $(\gamma_{01}, \gamma_{02})$ that can be
used in such a representation of $\Sigma$ satisfy

(5.7) $\gamma_{01}, \gamma_{02} \ne 0$ and $\frac{\gamma_{02}}{\gamma_{01}} \in \left(\frac{\sigma_{12}}{\sigma_{11}}, \frac{\sigma_{22}}{\sigma_{12}}\right)$,

and any value in the interval $(\sigma_{12}/\sigma_{11}, \sigma_{22}/\sigma_{12})$ is possible for their ratio.
Let $f : (0,\infty)^m \times \mathbb{R}^m \to F_{m,1}$ be the parametrization map of $F_{m,1}$; compare
the proof of Theorem 5.1. Let $J(\delta, \Gamma)$ be the $\binom{m+1}{2} \times 2m$-Jacobian matrix of
$f$ at $(\delta, \Gamma)$. By Lemma 2.4(iv),

$\Sigma'(d, c) = J(\delta_0, \Gamma_0) \begin{pmatrix} d \\ c \end{pmatrix} \in \mathbb{R}^{\binom{m+1}{2}}, \quad d, c \in \mathbb{R}^m,$

is in the tangent cone $T_{F_{m,1}}(\Sigma)$. Let $g, h$ be any two distinct indices in
$\{3, \dots, m\}$. The diagonal entries of $\Sigma'(d, c)$ are

$\sigma'_{11}(d,c) = d_1 + 2\gamma_{01}c_1, \quad \sigma'_{22}(d,c) = d_2 + 2\gamma_{02}c_2, \quad \sigma'_{gg}(d,c) = d_g.$

Choosing the values $d_i$ appropriately, these diagonal entries may be any real
number. The off-diagonal entries of $\Sigma'(d, c)$ are

$\sigma'_{12}(d,c) = c_1\gamma_{02} + c_2\gamma_{01}, \quad \sigma'_{1g}(d,c) = \gamma_{01}c_g,$

$\sigma'_{gh}(d,c) = 0, \quad \sigma'_{2g}(d,c) = \gamma_{02}c_g.$

By appropriate choice of $c_1$ and $c_2$, $\sigma'_{12}(d,c)$ may take on any real value. The
entries $\sigma'_{2g}(d,c)$ and $\sigma'_{1g}(d,c)$ are either both zero or both nonzero with their
ratio satisfying (5.7). This is equivalent to the existence of a multiplier $\eta$ in
the interval in (5.7) such that $\sigma'_{2g}(d,c) = \eta\,\sigma'_{1g}(d,c)$ for all $g \ge 3$. Therefore,
we have shown that any vector in $\bar T_{F_{m,1}}(\Sigma)$ for which the multiplier $\eta$ is
in the open interval in (5.7) is in the tangent cone $T_{F_{m,1}}(\Sigma)$. However, the
tangent cone is a closed set such that the same holds also if $\eta$ is in the closure
of the interval in (5.7), which was our claim. □

[Figure 3: four histograms of p-values (horizontal axis: p-value), with panel titles $\Gamma = (1,1,1,1)^t$, $\Gamma = (1,1,1,0)^t$, $\Gamma = (1,1,0,0)^t$ and $\Gamma = (1,0,0,0)^t$.]

Fig. 3. Histograms of 20,000 simulated p-values for the likelihood ratio test of (6.1) with
$m = 4$ and sample size $n = 1000$. The p-values are computed under $\chi^2_2$. The true covariance
matrix is equal to $\Delta + \Gamma\Gamma^t$, where $\Delta = \mathrm{diag}(1/3, 1/3, 1/3, 1/3)$ and $\Gamma$ is varied as indicated
in the histogram titles. Under these choices pairwise correlations are either zero or equal
to 3/4.

We remark that the description of the tangent cone in Lemma 5.6 yields a
parametrization of the tangent cone. The multiplier $\eta$ in this parametrization
is unique unless $s_{jg} = s_{ig} = 0$ for all $g \notin \{i,j\}$.
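The interval of multipliers in Lemma 5.6 corresponds to the different ways of writing a covariance matrix with a single nonzero off-diagonal entry as $\Delta_0 + \Gamma_0\Gamma_0^t$. A minimal sketch for the case $i = 1$, $j = 2$, $\sigma_{12} > 0$ follows; the function names are hypothetical, and only the positive pair of loadings is returned although the negated pair works equally well.

```python
from math import sqrt


def eta_interval(s11, s22, s12):
    """Endpoints of the multiplier interval of Lemma 5.6 for s12 > 0."""
    if not (s12 > 0 and s12 * s12 < s11 * s22):
        raise ValueError("need s12 > 0 and a positive definite 2 x 2 block")
    return (s12 / s11, s22 / s12)


def representation(s11, s22, s12, ratio):
    """Return (delta_1, delta_2, gamma_01, gamma_02) with gamma_02/gamma_01 = ratio.

    The loadings satisfy gamma_01 * gamma_02 = s12; the diagonal entries
    delta_1, delta_2 are positive exactly when ratio lies in the open interval.
    """
    g1 = sqrt(s12 / ratio)
    g2 = ratio * g1
    return (s11 - g1 * g1, s22 - g2 * g2, g1, g2)


if __name__ == "__main__":
    lo, hi = eta_interval(2.0, 2.0, 1.0)
    print(lo, hi)                                # 0.5 2.0
    print(representation(2.0, 2.0, 1.0, 1.0))    # ratio inside the interval
    print(representation(2.0, 2.0, 1.0, 3.0))    # ratio outside: delta_2 < 0
```

Ratios outside the interval force one of the diagonal entries of $\Delta_0$ to be nonpositive, which is why they cannot occur in (5.7).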

6. Likelihood ratio tests in one-factor analysis. In this section we discuss
the limiting distributions for the likelihood ratio statistic $\lambda_n$ in different
testing problems involving the factor analysis model with $\ell = 1$ factor.

6.1. Saturated alternative. Consider testing the one-factor model against
a saturated alternative, that is,

(6.1) $H_0 : \Sigma \in F_{m,1}$ vs. $H_1 : \Sigma \notin F_{m,1}$,

where we assume that $m \ge 4$ such that the set $F_{m,1}$ is of positive codimension
$\binom{m+1}{2} - 2m$. Statistical software, such as R with command factanal,
allows one to compute numerically the likelihood ratio statistic $\lambda_n$ for this
problem. In such software, p-values are computed using the $\chi^2$-distribution
with $\binom{m+1}{2} - 2m$ degrees of freedom. Figure 3 shows histograms of simulated
p-values computed with factanal. (Note that factanal employs a Bartlett
correction.) While the two histograms on the left confirm the expected uniform
distribution, this is not the case for the two histograms to the right.
It is interesting that the p-values for $\Gamma = (1,1,0,0)^t$ tend to be smaller than
under a uniform distribution whereas the opposite is true for $\Gamma = (1,0,0,0)^t$.

Figure 3 suggests that there should be at least three different types of
limiting distributions for $\lambda_n$. The next result confirms this fact.

Theorem 6.1. Let $\lambda_n$ be the likelihood ratio statistic for testing (6.1).
Assume the true covariance matrix $\Sigma_0 = (\sigma_{gh})$ is in $F_{m,1}$, and define
$Z \sim \mathcal{N}_{\binom{m+1}{2}}(0, I)$ to be a standard multivariate normal random vector. When
$n \to \infty$ it holds that:

(i) If $\Sigma_0$ has at least two nonzero off-diagonal entries $\sigma_{ij}$ and $\sigma_{uv}$ with $i < j$ and
$u < v$, then $\lambda_n$ converges to the $\chi^2$-distribution with $\binom{m+1}{2} - 2m$ degrees of
freedom.

(ii) If $\Sigma_0$ has exactly one nonzero off-diagonal entry $\sigma_{ij}$ with $i < j$, then
$\lambda_n$ converges to the distribution of the squared Euclidean distance between
$Z$ and the (topological) closure of the set of $S = (s_{gh}) \in A_{F_{m,1}}(\Sigma_0)$ such
that $s_{jg} = \eta \cdot s_{ig}$ for all $g \notin \{i,j\}$ and some $\eta > |\rho_{ij}|/\sqrt{1 - \rho_{ij}^2}$. Here,
$\rho_{ij} = \sigma_{ij}/\sqrt{\sigma_{ii}\sigma_{jj}}$ is the correlation between the $i$th and the $j$th variable.

(iii) If $\Sigma_0$ is diagonal, then $\lambda_n$ converges to the distribution of the squared
Euclidean distance between $Z$ and the tangent cone $T_{F_{m,1}}(\Sigma_0)$ given in Lemma 5.2.

Proof. (i) By Theorem 5.1, this is the smooth case and the tangent
cone a linear space of the claimed codimension.

(iii) Let $\Sigma_0$ be diagonal. Then the Fisher-information $I(\Sigma_0)$ and its symmetric
square root $I(\Sigma_0)^{1/2}$ are diagonal; compare Example 2.2. The diagonal
entries of $I(\Sigma_0)$ that are associated with off-diagonal entries $\sigma_{ij}$, $i < j$,
factor as

$I(\Sigma_0)_{ij,ij} = \frac{1}{\sigma_{ii}\sigma_{jj}}.$

It follows that the tangent cone $T_{F_{m,1}}(\Sigma_0)$ given in Lemma 5.2 is invariant
under transformation with $I(\Sigma_0)^{1/2}$. Hence, the claim follows from Theorem 2.6
(recall Lemma 3.3 and Remark 3.4).

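(ii) In the remaining case, $\Sigma_0$ has exactly one nonzero off-diagonal element, which
we assume to be $\sigma_{12} > 0$ (the result for $\sigma_{12} < 0$ is analogous). When listing
its rows and columns in the order

$11 < 12 < 22 < 13 < 23 < 14 < 24 < \cdots < 1m < 2m < 33 < 34 < \cdots < mm$

the Fisher-information $I(\Sigma_0)$ is block-diagonal with blocks $\{11, 12, 22\}$,
$\{1g, 2g\}$ for $g \ge 3$, and singleton blocks for the remaining indices. The block
for a pair $(1g, 2g)$ with $g \ge 3$ is

(6.2) $I(\Sigma_0)_{\{1g,2g\}\times\{1g,2g\}} = \frac{1}{\sigma_{gg}} (\Sigma_{12\times 12})^{-1}.$

Consider the following block-diagonal square root of $I(\Sigma_0)$. For block
$11 < 12 < 22$ take any square root and for the entries $33 < \cdots < mm$ take the
(univariate) square root. For the blocks $1g < 2g$ use the Choleski-decomposition
of (6.2) to obtain the square root

$\frac{1}{\sqrt{\sigma_{gg}}} \begin{pmatrix} \frac{1}{\sqrt{\sigma_{11.2}}} & -\frac{\sigma_{12}}{\sigma_{22}\sqrt{\sigma_{11.2}}} \\ 0 & \frac{1}{\sqrt{\sigma_{22}}} \end{pmatrix},$

where $\sigma_{11.2} = \sigma_{11} - \sigma_{12}^2/\sigma_{22}$. Suppose $\tau = (\tau_{gh})$ is an element of $T_{F_{m,1}}(\Sigma_0)$ for
which $\tau_{2g} = \tilde\eta \cdot \tau_{1g}$ for all $g \ge 3$ and $\tilde\eta \in (\sigma_{12}/\sigma_{11}, \sigma_{22}/\sigma_{12})$. Under multiplication
with the constructed square root of $I(\Sigma_0)$, $\tau$ is mapped to an element
$S = (s_{gh})$ of $A_{F_{m,1}}(\Sigma_0)$ for which $s_{2g} = \eta \cdot s_{1g}$ for all $g \ge 3$. The multiplier $\eta$
is equal to

(6.3) $\eta = \frac{(1/\sqrt{\sigma_{22}})\,\tilde\eta}{1/\sqrt{\sigma_{11.2}} - \bigl(\sigma_{12}/(\sigma_{22}\sqrt{\sigma_{11.2}})\bigr)\,\tilde\eta} = \frac{\sqrt{\sigma_{11}\sigma_{22} - \sigma_{12}^2}}{\sigma_{22}/\tilde\eta - \sigma_{12}}.$

Therefore,

(6.4) $\eta \in \left( \frac{\sigma_{12}}{\sqrt{\sigma_{11}\sigma_{22} - \sigma_{12}^2}}, \ \infty \right) = \left( \frac{\rho_{12}}{\sqrt{1 - \rho_{12}^2}}, \ \infty \right).$

We considered $\tau \in T_{F_{m,1}}(\Sigma_0)$ with multiplier $\tilde\eta$ in the open interval
$(\sigma_{12}/\sigma_{11}, \sigma_{22}/\sigma_{12})$. By taking the closure the remaining cases are covered.
□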
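The algebra behind (6.3) and (6.4) can be verified numerically. The sketch below is our own illustration: it builds the upper-triangular factor of the $2 \times 2$ block without the common scalar $1/\sqrt{\sigma_{gg}}$ (which does not affect ratios), applies it to a vector with ratio $\tilde\eta$, and compares the resulting ratio with the closed form. All function names are hypothetical.

```python
from math import sqrt


def sqrt_block(s11, s22, s12):
    """Upper-triangular R with R^t R equal to the inverse of [[s11, s12], [s12, s22]]."""
    s11_2 = s11 - s12 * s12 / s22
    return [[1 / sqrt(s11_2), -s12 / (s22 * sqrt(s11_2))],
            [0.0, 1 / sqrt(s22)]]


def transformed_multiplier(s11, s22, s12, eta_tilde):
    """Ratio s_2g / s_1g after applying R to the vector (1, eta_tilde)."""
    r = sqrt_block(s11, s22, s12)
    s1 = r[0][0] + r[0][1] * eta_tilde
    s2 = r[1][0] + r[1][1] * eta_tilde
    return s2 / s1


def closed_form(s11, s22, s12, eta_tilde):
    """Right-hand side of (6.3)."""
    return sqrt(s11 * s22 - s12 * s12) / (s22 / eta_tilde - s12)


if __name__ == "__main__":
    s11, s22, s12 = 2.0, 2.0, 1.0
    for eta_tilde in (0.5, 1.0, 1.5):
        print(eta_tilde,
              transformed_multiplier(s11, s22, s12, eta_tilde),
              closed_form(s11, s22, s12, eta_tilde))
```

At the lower endpoint $\tilde\eta = \sigma_{12}/\sigma_{11}$ the two expressions agree with $\rho_{12}/\sqrt{1-\rho_{12}^2}$, and they grow without bound as $\tilde\eta$ approaches $\sigma_{22}/\sigma_{12}$.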



Remark 6.2. Theorem 12.1 in the seminal paper by Anderson and Rubin [1]
gives a sufficient condition for $\chi^2$-asymptotics for the likelihood ratio
tests in factor analysis. For the one-factor testing problem (6.1), this theorem
states the following. Suppose the true covariance matrix $\Sigma_0 \in F_{m,1}$
is represented as $\Sigma_0 = \Delta + \Gamma\Gamma^t$ with $\Delta$ diagonal and positive definite and
$\Gamma \in \mathbb{R}^m$. Then the $\chi^2$-asymptotics from Theorem 6.1(i) hold if the entry-wise
(or Hadamard) square of the matrix

$\Delta - \Gamma(\Gamma^t\Delta^{-1}\Gamma)^{-1}\Gamma^t$

has nonzero determinant ($\Gamma \ne 0$ is required for this condition to be well
defined). We checked that for $m = 4, 5, 6$ this condition is indeed equivalent
to requiring two nonzero entries above the diagonal of $\Sigma_0$. However, in the
present context, proving Theorem 6.1(i) via Theorem 5.1 seems easier than
any attempt to simplify the condition of Anderson and Rubin [1] for the
one-factor case.
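The determinant condition in this remark can be evaluated exactly for particular parameter values. The following sketch is an illustration under our own choices: $\Delta = \mathrm{diag}(1/3, \dots, 1/3)$ as in Figure 3, helper names of our invention, and exact rational arithmetic. It checks individual points only and is not a proof of the stated equivalence.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((k for k in range(c, n) if a[k][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for k in range(c + 1, n):
            factor = a[k][c] / a[c][c]
            a[k] = [x - factor * y for x, y in zip(a[k], a[c])]
    return result


def anderson_rubin_det(delta, gamma):
    """det of the entrywise square of Delta - Gamma (Gamma^t Delta^-1 Gamma)^-1 Gamma^t."""
    m = len(gamma)
    scale = sum(Fraction(g * g) / d for g, d in zip(gamma, delta))
    if scale == 0:
        raise ValueError("Gamma must be nonzero")
    mat = [[(delta[i] if i == j else Fraction(0))
            - Fraction(gamma[i] * gamma[j]) / scale
            for j in range(m)] for i in range(m)]
    return det([[x * x for x in row] for row in mat])


if __name__ == "__main__":
    delta = [Fraction(1, 3)] * 4
    print(anderson_rubin_det(delta, [1, 1, 1, 0]) != 0)   # True: smooth case
    print(anderson_rubin_det(delta, [1, 1, 0, 0]) == 0)   # True: one nonzero entry
    print(anderson_rubin_det(delta, [1, 0, 0, 0]) == 0)   # True: diagonal case
```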



The distribution described in Theorem 6.1(ii) depends on unknown parameters.
This is not the case for the distributional bound obtained from
the algebraic tangent cone, for which a nice connection to eigenvalues of
Wishart matrices emerges.

Theorem 6.3. Let $V$ have a chi-square distribution with $\binom{m-2}{2}$ degrees
of freedom and let $W$ be distributed like the smaller of the two eigenvalues
of a $2 \times 2$-Wishart matrix with $m - 2$ degrees of freedom and scale parameter
the identity matrix $I$. If the true covariance matrix $\Sigma_0 = (\sigma_{gh}) \in F_{m,1}$ has
exactly one nonzero off-diagonal entry $\sigma_{ij}$ with $i < j$, then the distribution
of the squared Mahalanobis distance

$\min_{S \in A_{F_{m,1}}(\Sigma_0)} (Z - S)^t I(\Sigma_0)(Z - S), \quad Z \sim \mathcal{N}_{\binom{m+1}{2}}(0, I(\Sigma_0)^{-1}),$

is the distribution of $V + W$, where $V$ and $W$ are independent.

Proof. Without loss of generality, assume $\sigma_{12} \ne 0$. We can work with
the square root of the Fisher-information $I(\Sigma_0)$ that was used to prove
Theorem 6.1(ii). Due to the special block-diagonal structure of $I(\Sigma_0)$, it
holds that $A_{F_{m,1}}(\Sigma_0)$ is invariant under transformation with $I(\Sigma_0)^{1/2}$. Thus, the
Mahalanobis distance has the same distribution as the Euclidean distance
between $Z \sim \mathcal{N}_{\binom{m+1}{2}}(0, I)$ and $A_{F_{m,1}}(\Sigma_0)$. The squared Euclidean distance breaks
into the sum of

$V = \sum_{3 \le g < h \le m} Z_{gh}^2$

and the squared Euclidean distance $W$ between the submatrix $Z_{\{1,2\}\times\{3,\dots,m\}}$
and the set of rank one matrices. The latter distance is equal to the smaller
singular value of $Z_{\{1,2\}\times\{3,\dots,m\}}$. The square of this singular value is the
smaller eigenvalue $W$ of the $2 \times 2$-Wishart matrix obtained by multiplying
$Z_{\{1,2\}\times\{3,\dots,m\}}$ with its transpose. □
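The decomposition in this proof suggests a direct way to simulate from the distribution of $V + W$. A minimal sketch using only the standard library is given below; the function names are ours, and the eigenvalues of the $2 \times 2$ matrix $ZZ^t$ are obtained from the closed-form solution of its characteristic polynomial.

```python
import random
from math import sqrt


def wishart_eigenvalues(df, rng):
    """Eigenvalues (smaller, larger) of Z Z^t for a 2 x df standard normal Z."""
    row1 = [rng.gauss(0.0, 1.0) for _ in range(df)]
    row2 = [rng.gauss(0.0, 1.0) for _ in range(df)]
    a = sum(x * x for x in row1)
    c = sum(y * y for y in row2)
    b = sum(x * y for x, y in zip(row1, row2))
    half_trace = (a + c) / 2
    radius = sqrt(((a - c) / 2) ** 2 + b * b)
    # Guard against tiny negative values caused by rounding.
    return max(half_trace - radius, 0.0), half_trace + radius


def sample_v_plus_w(m, rng):
    """One draw of V + W for a model with m observed variables (m >= 4)."""
    n_zero_constraints = (m - 2) * (m - 3) // 2   # binom(m - 2, 2)
    v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_zero_constraints))
    w, _ = wishart_eigenvalues(m - 2, rng)
    return v + w


if __name__ == "__main__":
    rng = random.Random(2024)
    draws = sorted(sample_v_plus_w(4, rng) for _ in range(20000))
    # Empirical 95% quantile of V + W for m = 4.
    print(draws[int(0.95 * len(draws))])
```

Quantiles estimated this way give the lower bound on $p_\infty(t)$ from Lemma 4.6, and unlike the limit in Theorem 6.1(ii) they do not depend on the unknown correlation.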

Looking back to Theorem 6.1, we see that the $\chi^2$-approximation to the
distribution of $\lambda_n$ appears to be valid at almost every covariance matrix in
$F_{m,1}$. It is thus tempting to view the singularities as mere theoretical oddities
and base inference purely on $\chi^2$-calculations. However, this is problematic
because the presence of singularities destroys any possible uniformity of the
convergence of $\lambda_n$ to a $\chi^2$-distribution. This can be seen in Figure 4, which
shows that the $\chi^2$-approximation becomes more and more inappropriate for
smaller and smaller pairwise correlations. A comparison with Figure 3 suggests
that this phenomenon is primarily due to the model geometry: small
correlations yield points too close to the singular locus of $F_{m,1}$. Indeed the
distribution of $\lambda_n$ exhibits features of the limiting distribution from Theorem 6.1(iii);
compare the histogram on the far right-hand side in Figure 3.

6.2. Testing submodels. In the goodness-of-fit problem (6.1), $\chi^2$-approximations
are valid if the true parameter point is far enough away from
the singular locus. However, when testing submodels of a factor analysis
model, $\chi^2$-approximations may become entirely invalid. We illustrate this
for testing the vanishing of some of the components of the parameter vector
$\Gamma = (\gamma_1, \dots, \gamma_m)^t$ defining the covariance matrix $\Sigma = \Delta + \Gamma\Gamma^t \in F_{m,1}$.

[Figure 4: four histograms of p-values (horizontal axis: p-value), one panel per value of $\rho$.]

Fig. 4. Histograms of 20,000 simulated p-values for the likelihood ratio test of (6.1) with
$m = 4$ and sample size $n = 50$. The p-values are computed under $\chi^2_2$. The true covariance
matrix is a correlation matrix with all off-diagonal entries equal to $\rho$, which is varied as
indicated in the histogram titles.

Let $F_{m,0k}$ be the set of covariance matrices $\Sigma = \Delta + \Gamma\Gamma^t$ in $F_{m,1}$ such that
$\Gamma = (\gamma_1, \dots, \gamma_m)^t$ satisfies that $\gamma_k = \gamma_{k+1} = \cdots = \gamma_m = 0$. Consider testing

(6.5) $H_{0k} : \Sigma \in F_{m,0k}$ vs. $H_1 : \Sigma \in F_{m,1} \setminus F_{m,0k}$.

Such tests constitute edge exclusion tests in graphical models with one hidden
variable; compare, for example, [30]. A positive definite matrix $\Sigma = (\sigma_{ij})$
is in $F_{m,0k}$ if and only if the submatrix $\Sigma_{[k-1]\times[k-1]}$ is in $F_{k-1,1}$ and $\sigma_{ij} = 0$
for all pairs $(i,j) \notin [k-1] \times [k-1]$ with $i \ne j$. Here, $[k-1] = \{1, \dots, k-1\}$.
Hence, the limiting distributions of the likelihood ratio statistic $\lambda_{n,k}$ for
testing (6.5) can be determined using Remark 2.8 and Theorems 5.1 and
6.1.

The case $k \ge 4$ is similar to the tests considered in Section 6.1. If $k \ge 4$
and the true covariance matrix $\Sigma_0 \in F_{m,0k}$ cannot be transformed into a
matrix in $F_{m,03}$ by permutations of rows and columns, then $\lambda_{n,k}$ converges
to a $\chi^2_{m-k+1}$-distribution as $n \to \infty$; here $m - k + 1 = 2m - (m + k - 1)$ is the
codimension of $F_{m,0k}$ in $F_{m,1}$. At matrices in $F_{m,03}$ (and the possible
permutations thereof) nonstandard limiting distributions arise.

The cases $k \le 3$ are different. If $k = 3$, then there does not exist a true
covariance matrix $\Sigma_0 \in F_{m,0k}$ for which $\lambda_{n,k}$ converges to a $\chi^2$-distribution.
For $k = 1, 2$, the hypotheses $H_{01}$ and $H_{02}$ are equal because $F_{m,01} = F_{m,02}$ is
the set of diagonal covariance matrices. In this case the limiting distribution
of $\lambda_{n,k}$ does not depend on $\Sigma_0 \in F_{m,01} = F_{m,02}$. We were not able to connect
these distributions to any well-studied distribution but simulations can be
used to determine the quantiles of this distribution for a valid (asymptotic)
test of $H_{01} = H_{02}$.

When testing $H_{03}$ the limiting distribution of $\lambda_{n,k}$ depends on the correlation
$\rho_{12}$. Nevertheless, we have the following corollary to Theorem 6.3;
recall Lemma 4.6.

Corollary 6.4. If $\lambda_{n,k}$ is the likelihood ratio statistic for (6.5) with
$k = 3$ and the true covariance matrix $\Sigma_0 = (\sigma_{ij})$ is in $F_{m,03}$ with $\sigma_{12} \ne 0$,
then

$\lim_{n\to\infty} P(\lambda_{n,k} > t) \le P(W > t),$

where $W$ is distributed like the larger of the two eigenvalues of a $2 \times 2$-Wishart
matrix with $m - 2$ degrees of freedom and scale parameter the identity
matrix $I$.

Table 1
Levels of the conservative test that rejects $H_{03}$ if the likelihood ratio statistic exceeds the
95%-quantile of a Wishart eigenvalue distribution. The true covariance matrices are
correlation matrices in $F_{m,03}$ with $\rho_{12}$ being varied. Each level was computed in 20,000
simulations

                      rho_12
                      0.8     0.7     0.6     0.5     0.4     0.3     0.2
m = 4   n = 100     0.027   0.028   0.034   0.034   0.036   0.042   0.048
        n = 200     0.025   0.027   0.029   0.031   0.032   0.035   0.043
        n = 500     0.021   0.026   0.030   0.031   0.033   0.037   0.037
m = 8   n = 100     0.026   0.029   0.032   0.035   0.040   0.057   0.084
        n = 200     0.026   0.029   0.030   0.033   0.035   0.041   0.061
        n = 500     0.023   0.027   0.027   0.031   0.032   0.035   0.040

The algebraic tangent cone calculation that yields Corollary 6.4 thus leads
to a simple and conservative test of $H_{03}$: reject the null hypothesis if the
observed likelihood ratio statistic is larger than the $1 - \alpha$ quantile of the
distribution of the eigenvalue $W$. The asymptotic level of this test is provably
smaller than $\alpha$ if $\sigma_{12} \ne 0$. Again we point out that there is no uniformity
in the convergence to this level and large sample sizes may be required for
smaller absolute values of $\sigma_{12}$. Table 1 shows simulated levels for this test
using the critical values given in [16]. The increase of the level as $\rho_{12}$ decreases is in
agreement with Lemma 5.6 and Theorem 6.1(ii). We note that, if desired, a
more powerful yet still asymptotically conservative test can be obtained by
relaxing the multiplier $\eta$ in Theorem 6.1(ii) to be in $[0, \infty)$. Critical values
for the resulting test of $H_{03}$ could be computed using simulation.
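For readers without access to tabulated critical values, the quantile of the larger Wishart eigenvalue in Corollary 6.4 can be approximated by Monte Carlo. The sketch below is our own substitute for the tables in [16], not the procedure used to produce Table 1; the function names, the number of draws and the seed are arbitrary choices.

```python
import random
from math import sqrt


def larger_wishart_eigenvalue(df, rng):
    """Larger eigenvalue of Z Z^t for a 2 x df standard normal matrix Z."""
    row1 = [rng.gauss(0.0, 1.0) for _ in range(df)]
    row2 = [rng.gauss(0.0, 1.0) for _ in range(df)]
    a = sum(x * x for x in row1)
    c = sum(y * y for y in row2)
    b = sum(x * y for x, y in zip(row1, row2))
    return (a + c) / 2 + sqrt(((a - c) / 2) ** 2 + b * b)


def critical_value(m, alpha=0.05, n_draws=20000, seed=0):
    """Monte Carlo estimate of the (1 - alpha)-quantile of W for m variables."""
    rng = random.Random(seed)
    draws = sorted(larger_wishart_eigenvalue(m - 2, rng) for _ in range(n_draws))
    index = min(int((1 - alpha) * n_draws), n_draws - 1)
    return draws[index]


if __name__ == "__main__":
    # Reject H_03 when the observed likelihood ratio statistic exceeds this value.
    print(critical_value(4))
    print(critical_value(8))
```

Because the estimate is itself random, a large number of draws (or a tabulated value) should be used before relying on it in an actual test.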

6.3. Comments on multi-factor analysis. Factor analysis forms a basic
building block for graphical models with hidden variables. As such our study
of the one-factor case is of interest for graphical models with one hidden
variable. However, in many of its applications factor analysis serves merely
as a tool for dimension reduction and the number of factors will typically be
much larger than one. While the geometry of models with multiple factors
is still largely unknown, the presented theory can offer insights into some of
the phenomena encountered by practitioners.

In Section 6.2, we saw that testing the complete independence model,
which can be viewed as the case of zero factors, against the one-factor model
is a problem for which $\chi^2$-approximations are inappropriate. The generalization
of this problem is to test the model with $\ell$ factors against the model
with $\ell + 1$ factors, that is,

(6.6) $H_0 : \Sigma \in F_{m,\ell}$ vs. $H_1 : \Sigma \in F_{m,\ell+1} \setminus F_{m,\ell}$.

Simulations such as those in [17] suggest that, regardless of where in the null
hypothesis the true distribution is, the likelihood ratio statistic for (6.6) does
not follow a $\chi^2$-distribution. Similarly, the limiting distribution when testing

(6.7) $H_0 : \Sigma \in F_{m,\ell}$ vs. $H_1 : \Sigma \notin F_{m,\ell}$

is not a $\chi^2$-distribution if in fact $\Sigma \in F_{m,k}$ for some $k < \ell$. The algebraic
geometrical explanation for these phenomena is that the set $F_{m,\ell+1}$ is singular
along its subset $F_{m,\ell}$ ([10], page 491). Note, however, that the set $F_{m,\ell+1}$
has many other more subtle singularities outside $F_{m,\ell}$. These singularities
are poorly understood at present.

In the case of $\ell = 1$ factor, singularities arise from independence among
observed variables. Consequently, issues with singularities of one-factor models
can be avoided if an investigator is free to select variables with pairwise
correlations that are large enough for the available sample size. However,
problems with singularities are no longer this simple with more than one
factor. If $\ell \ge 2$, then detecting whether an estimate of $\Sigma$ is (close to) a singularity
of $F_{m,\ell}$ is not a matter of merely gauging whether correlations are
different from zero. In the Appendix we illustrate this for the model $F_{5,2}$,
which at present is the only model with more than one factor for which the
singular locus is known. The details on the algebraic tangent cones of $F_{5,2}$
given in this Appendix show just how complicated the geometry of seemingly
simple hidden variable models is.

7. Conclusion. We considered likelihood ratio tests of semi-algebraic
hypotheses. Using Chernoff's theorem, we showed that under mild probabilistic
regularity conditions the large sample limiting distribution of the likelihood
ratio statistic always exists. If the true parameter point is a model
singularity, then the limiting distribution is determined by the tangent cone.
Tangent cones at singularities are generally nonconvex and lead to nonstandard
limiting distributions that are different from the mixtures of
$\chi^2$-distributions that are often encountered in boundary problems. In fact,
singularities can entail arbitrarily complex limiting distributions because
any closed semi-algebraic cone of codimension one or larger may occur as
tangent cone to a real algebraic variety [12].

Minima of (possibly dependent) $\chi^2$-random variables were seen to be
important for locally identifiable models (recall Proposition 3.5). It would be
interesting to find good stochastic bounds on the distribution of such minima,
which could be used to bound p-values when the true parameter point is (close
to) a singularity. It seems plausible that bounds could be derived from special
constellations of the equi-dimensional tangent spaces that induce the
$\chi^2$-variables. Moreover, as pointed out by a referee, the "tube" and
"Euler characteristic" methods may be useful for approximating limiting
distributions; see [32, 33] and the references therein.
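To make the role of such minima concrete, the following Python sketch simulates
the minimum of two dependent $\chi^2_1$ variables and compares its tail with the
$\chi^2_1$ tail. It assumes that the two variables are squares of standard
normal variables with correlation `rho` (standing in for the cosine between two
tangent hyperplanes); the function names and the Monte Carlo setup are
illustrative choices and are not taken from the paper.

```python
import math
import random


def min_chi2_tail(rho, t, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(min(Z1^2, Z2^2) > t) for standard normal
    Z1, Z2 with correlation rho (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)  # Corr(z1, z2) = rho
        if min(z1 * z1, z2 * z2) > t:
            hits += 1
    return hits / n_draws


def chi2_1_tail(t):
    """Exact tail probability P(chi^2_1 > t) = erfc(sqrt(t / 2))."""
    return math.erfc(math.sqrt(t / 2.0))


if __name__ == "__main__":
    t = 3.84  # close to the usual 5% critical value of chi^2_1
    for rho in (0.0, 0.5, 0.9):
        print(rho, min_chi2_tail(rho, t), chi2_1_tail(t))
```

Because the minimum never exceeds either of the two variables, the $\chi^2_1$
tail is an upper bound for every `rho`, so p-values computed from $\chi^2_1$ are
conservative in this setting; the simulation only indicates how conservative
they are for a given correlation.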

Factor analysis presents interesting examples of models with singularities.
Despite their long history and widespread use in practice, these models are
far from fully understood. Practical assessment of statistical significance in
factor analysis employs $\chi^2$-computations based on the sufficient condition
in [1], Theorem 12.1. However, little is known about the structure of the
covariance matrices at which $\chi^2$-asymptotics fail and about the nature of
the nonstandard limiting distributions. In Sections 5 and 6 we were able to
address these problems for factor analysis with one factor.

Factor analysis and all other examples considered in this paper were mod- 
els for the multivariate normal distribution. Even in this realm there are 
many other models that could be studied in a similar fashion. For example, 
more general Gaussian hidden variable models as well as structural equa- 
tion models could be considered in lieu of factor analysis. But there are also 
many models for the multinomial distribution that have singularities; see 
for example [13]. The algebraic geometric techniques presented in this paper 
provide a unified approach for future work on the impacts of singularities 
on likelihood ratio tests in different classes of nonsmooth models. 

A key feature of the $\chi^2$-theory for smooth models is that the limiting
distribution is pivotal, that is, does not depend on where in the null
hypothesis the true parameter point is located. In some nonstandard problems,
such as testing submodels of the one-factor model (Section 6.2), this pivotality
is preserved (or at least stochastic bounds are pivotal). However, our
computations for the simplest nontrivial two-factor model suggest that even for
testing problems involving only slightly more general and still seemingly
simple hidden variable models, the limiting distribution will depend on
unknown nuisance parameters. One possible approach to circumventing this
problem is to design bootstrap procedures. While this is a topic beyond the
scope of this paper, we expect the algebraic framework laid out here will
be helpful for investigating asymptotic correctness of bootstrap tests in the
presence of singularities.

APPENDIX A: DETAILS ON THE FEEDBACK MODEL 

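Here we provide details on the feedback model $\mathcal{P}_{\Theta_0}$ from
Example 3.6; the same notation is used.

A.1. Parameter identifiability. Instead of studying the parameterization
map $f$ directly, we work with precision matrices $\Sigma^{-1}$. For
$\beta \in \mathbb{R}^5$ and $\kappa \in (0,\infty)^4$, let $g$ be the map from
$(\beta,\kappa)$ to the inverse of the covariance matrix $f(\beta,\omega)$ with
$\omega_i = 1/\kappa_i$. The map
$g: \Gamma \subset \mathbb{R}^5 \times (0,\infty)^4 \to \mathbb{R}^{4\times 4}_{\mathrm{sym}}$
is polynomial with $g(\beta,\kappa)$ equal to the symmetric matrix with upper
triangle

$$
\begin{pmatrix}
\kappa_1 + \beta_{21}^2\kappa_2 + \beta_{31}^2\kappa_3 & \beta_{32}\beta_{31}\kappa_3 - \beta_{21}\kappa_2 & -\beta_{31}\kappa_3 & \beta_{24}\beta_{21}\kappa_2 \\
 & \kappa_2 + \beta_{32}^2\kappa_3 & -\beta_{32}\kappa_3 & -\beta_{24}\kappa_2 \\
 & & \kappa_3 + \beta_{43}^2\kappa_4 & -\beta_{43}\kappa_4 \\
 & & & \kappa_4 + \beta_{24}^2\kappa_2
\end{pmatrix}.
$$

(A.1)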

Computations in Maple and Singular [15] yield the following results.

The Jacobian of $g$ has an $8 \times 8$-minor equal to a product of powers of
the $\kappa_i$. Hence, its rank is 8 or 9 for all $(\beta,\kappa) \in \Gamma$.
Computing the radical of the ideal of $9 \times 9$-minors, we see that the rank
is 9 unless (3.2) holds. In order to investigate identifiability of
$\mathcal{P}_{\Theta_0}$ we perform computations for solving the polynomial
equations $g(\bar\beta,\bar\kappa) = g(\beta,\kappa)$ for $(\bar\beta,\bar\kappa)$.
We find the following structure:

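Lemma A.1. The model $\mathcal{P}_{\Theta_0}$ is globally identifiable at
$(\beta,\kappa)$ if and only if:

(i) $\beta_{31} + \beta_{32}\beta_{21} \neq 0$, or

(ii) $\beta_{31} + \beta_{32}\beta_{21} = 0$ and at least one of the parameters
$\beta_{32}$, $\beta_{43}$, $\beta_{24}$ is zero, or

(iii) $\beta_{31} + \beta_{32}\beta_{21} = 0$ and $\beta_{32}\beta_{43}\beta_{24} = -1$.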

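If $\mathcal{P}_{\Theta_0}$ is not globally identifiable at $(\beta,\kappa)$ and
$\Sigma^{-1} = g(\beta,\kappa)$, then
$g^{-1}(\Sigma^{-1}) = \{(\beta,\kappa), (\bar\beta,\bar\kappa)\}$ is of
cardinality two; compare (3.3). The nontrivial element $(\bar\beta,\bar\kappa)$
is a rational function of $(\beta,\kappa)$. It holds that
$\bar\beta_{21} = \beta_{21}$ and $\bar\kappa_1 = \kappa_1$. For $i \in \{2,3,4\}$,

(A.2) $\bar\beta_{i,i-1}^{-1} = \beta_{i,i-1}\,\dfrac{\kappa_i\kappa_{i+1} + \kappa_i\kappa_{i-1}\beta_{i-1,i+1}^2 + \kappa_{i-1}\kappa_{i+1}\beta_{i-1,i+1}^2\beta_{i+1,i}^2}{\kappa_{i-1}\kappa_{i+1} + \kappa_i\kappa_{i+1}\beta_{i,i-1}^2 + \kappa_i\kappa_{i-1}\beta_{i,i-1}^2\beta_{i-1,i+1}^2}$

and

(A.3) $\bar\kappa_i = \kappa_i\beta_{i,i-1}/\bar\beta_{i,i-1}$.

The expressions in (A.2) and (A.3) can be read literally for $i = 3$; for
$i = 2, 4$ the indices $i \pm 1$ are to be read modulo 3 such that $4 + 1 = 2$
and $2 - 1 = 4$. Finally, the remaining component $\bar\beta_{31}$ is equal to
$-\bar\beta_{32}\bar\beta_{21}$.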
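The two-element fibre can be checked numerically at a single parameter point.
The Python sketch below is only an illustration under stated assumptions: it
writes $g(\beta,\kappa)$ as $(I - B)^T \mathrm{diag}(\kappa)(I - B)$, where $B$
holds the coefficients $\beta_{ij}$ (a structural-equation form assumed here
because it reproduces the entries of (A.1)), and it uses one reading of (A.2)
in which the inverse of $\bar\beta_{i,i-1}$ equals $\beta_{i,i-1}$ times the
ratio `num / den` computed in the code. The helper names are hypothetical.

```python
from fractions import Fraction as F

IDX = (1, 2, 3, 4)


def precision(beta, kappa):
    """g(beta, kappa) = (I - B)^T diag(kappa) (I - B); assumed form that
    reproduces the entries of (A.1). beta maps (i, j) to beta_ij."""
    a = {(i, j): F(int(i == j)) - beta.get((i, j), F(0)) for i in IDX for j in IDX}
    return {(r, c): sum(kappa[i] * a[i, r] * a[i, c] for i in IDX)
            for r in IDX for c in IDX}


def second_preimage(beta, kappa):
    """Candidate second parameter vector when beta_31 + beta_32 * beta_21 = 0.

    Uses kappa_bar_i = kappa_i * beta_{i,i-1} / beta_bar_{i,i-1} as in (A.3)
    and an assumed reading of (A.2), with indices on the cycle 2 -> 3 -> 4 -> 2.
    """
    cycle = {2: (4, 3), 3: (2, 4), 4: (3, 2)}  # i -> (i - 1, i + 1)
    new_beta = {(2, 1): beta[2, 1]}
    new_kappa = {1: kappa[1]}
    for i, (p, n) in cycle.items():
        b_i, b_p, b_n = beta[i, p], beta[p, n], beta[n, i]
        k_i, k_p, k_n = kappa[i], kappa[p], kappa[n]
        num = k_i * k_n + k_i * k_p * b_p**2 + k_p * k_n * b_p**2 * b_n**2
        den = k_p * k_n + k_i * k_n * b_i**2 + k_i * k_p * b_i**2 * b_p**2
        new_beta[i, p] = den / (b_i * num)   # reciprocal of b_i * num / den
        new_kappa[i] = k_i * b_i / new_beta[i, p]
    new_beta[3, 1] = -new_beta[3, 2] * new_beta[2, 1]
    return new_beta, new_kappa


# Arbitrary test point with beta_31 = -beta_32 * beta_21 and cycle product 2.
beta = {(2, 1): F(1), (3, 2): F(2), (4, 3): F(1), (2, 4): F(1)}
beta[3, 1] = -beta[3, 2] * beta[2, 1]
kappa = {i: F(1) for i in IDX}
beta_bar, kappa_bar = second_preimage(beta, kappa)
print(beta_bar != beta, precision(beta_bar, kappa_bar) == precision(beta, kappa))
```

Exact rational arithmetic is used so that equality of the two precision
matrices is tested without rounding error. At this test point the product
$\beta_{32}\beta_{43}\beta_{24}$ equals 2 for the original parameters and 1/2
for the second vector.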

Lemma A.2. If $\mathcal{P}_{\Theta_0}$ is locally identifiable at
$(\beta,\kappa) \in \Gamma$, then the map $g$ is proper at the precision matrix
$\Sigma^{-1} = (\sigma^{ij}) = g(\beta,\kappa)$.

Proof. Let $V \subset \mathbb{R}^{4\times 4}_{\mathrm{sym}}$ be a compact
neighborhood of $\Sigma^{-1}$ in $g(\Gamma)$. We assume in particular that $V$ is
bounded away from the boundary of the cone of positive definite matrices such
that the closure of $g^{-1}(V)$ is contained in $\Gamma$ (recall that points in
$\Gamma$ satisfy $\beta_{43}\beta_{32}\beta_{24} \neq 1$). By the local
identifiability assumption, none of the values of $\beta_{32}$, $\beta_{43}$ and
$\beta_{24}$ are zero, which implies that the off-diagonal entries $\sigma^{ij}$
are nonzero if $i,j \geq 2$. We may thus assume that for all precision matrices
$S = (s_{ij}) \in V$, the diagonal entries $s_{ii}$ as well as the absolute
values of off-diagonal entries $s_{ij}$ with $i,j \geq 2$ are in some finite
interval $[m, M]$ with $0 < m < M < \infty$.

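Suppose $S = (s_{ij}) = g(\beta,\kappa) \in V$. Since $\kappa_i \leq s_{ii}$ it
follows that all the components of $\kappa$ are in the interval $[0, M]$. For
the components of $\beta$ we have the inequalities

$\beta_{31}^2 m^2 \leq \beta_{31}^2 s_{23}^2 = \beta_{31}^2\beta_{32}^2\kappa_3^2 \leq s_{11}s_{22} \leq M^2$,

$\beta_{21}^2 m^2 \leq \beta_{21}^2 s_{24}^2 = \beta_{21}^2\beta_{24}^2\kappa_2^2 \leq s_{11}s_{44} \leq M^2$

and

$|\beta_{ij}| m \leq |\beta_{ij}s_{ij}| = \beta_{ij}^2\kappa_i \leq s_{jj} \leq M$, $(i,j) \in \{(4,3), (3,2), (2,4)\}$.

The absolute values of the components of $\beta$ are thus all contained in the
interval $[0, M/m]$. Hence, $(\beta,\kappa)$ is in the compactum
$[-M/m, M/m]^5 \times [0, M]^4$. □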

Proposition 3.5 does not apply to case (ii) in Lemma A.1 because if one
of $\beta_{32}$, $\beta_{43}$, $\beta_{24}$ is zero, then $g$ is not proper at
$g(\beta,\kappa)$. This can be seen in equation (A.2) where
$\bar\beta_{i,i-1} \to \infty$ if $\beta_{i,i-1} \to 0$. In case (iii) of
Lemma A.1, Proposition 3.5 does not apply because the rank of the Jacobian of
$g$ drops from 9 to 8.

A.2. Applying algebraic techniques. Example 3.6 was concerned with
local identifiability. In order to get an understanding of the globally
identifiable cases the techniques from Section 4 are useful.

Lemma A.3. A covariance matrix $\Sigma = f(\beta,\omega) \in \Theta_0$ is a
singularity if and only if $\beta_{31} + \beta_{32}\beta_{21} = 0$, in which case
the algebraic tangent cone $A_{\Theta_0}(\Sigma)$ is the union of the two
hyperplanes with the normal vectors $\eta$ and $\bar\eta$ from (3.4) and (3.5).

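Proof. Employing the technique of implicitization ([7], Chapter 3) and
the software Singular [15], we compute the Zariski closure $\bar\Theta_0$ that
is found to be given by the vanishing of the irreducible polynomial

$$
\begin{aligned}
f ={}& \sigma_{13}\sigma_{14}^3\sigma_{23}^2 - 2\sigma_{13}^2\sigma_{14}^2\sigma_{23}\sigma_{24} + \sigma_{13}^3\sigma_{14}\sigma_{24}^2 \\
&- \sigma_{12}\sigma_{14}^3\sigma_{23}\sigma_{33} + \sigma_{12}\sigma_{13}\sigma_{14}^2\sigma_{24}\sigma_{33} \\
&+ \sigma_{11}\sigma_{14}^2\sigma_{23}\sigma_{24}\sigma_{33} - \sigma_{11}\sigma_{13}\sigma_{14}\sigma_{24}^2\sigma_{33} \\
&+ \sigma_{12}^2\sigma_{14}^2\sigma_{33}\sigma_{34} - \sigma_{11}\sigma_{14}^2\sigma_{22}\sigma_{33}\sigma_{34} \\
&- \sigma_{12}^2\sigma_{13}\sigma_{14}\sigma_{34}^2 + \sigma_{11}\sigma_{13}\sigma_{14}\sigma_{22}\sigma_{34}^2 \\
&+ \sigma_{12}\sigma_{13}^2\sigma_{14}\sigma_{23}\sigma_{44} - \sigma_{11}\sigma_{13}\sigma_{14}\sigma_{23}^2\sigma_{44} \\
&- \sigma_{12}\sigma_{13}^3\sigma_{24}\sigma_{44} + \sigma_{11}\sigma_{13}^2\sigma_{23}\sigma_{24}\sigma_{44} \\
&- \sigma_{12}^2\sigma_{13}\sigma_{14}\sigma_{33}\sigma_{44} + \sigma_{11}\sigma_{13}\sigma_{14}\sigma_{22}\sigma_{33}\sigma_{44} \\
&+ \sigma_{12}^2\sigma_{13}^2\sigma_{34}\sigma_{44} - \sigma_{11}\sigma_{13}^2\sigma_{22}\sigma_{34}\sigma_{44}.
\end{aligned}
$$

We can also compute the singularities of $\Theta_0$, which are the matrices
$\Sigma = (\sigma_{ij})$ with $\sigma_{13} = \sigma_{14} = 0$, or equivalently the
matrices $f(\beta,\omega)$ with $\beta_{31} + \beta_{32}\beta_{21} = 0$.

Since the ideal $I(\Theta_0)$ is generated by the polynomial $f$, the algebraic
tangent cone at a singularity $\Sigma = f(\beta,\omega)$ is determined by the
polynomial $f_{\Sigma,\min} \in \mathbb{R}[t]$, which factorizes as

$$
(\beta_{43}\, t_{13} - t_{14})\bigl[(\beta_{32}^2\beta_{43}^2\omega_2 + \beta_{43}^2\omega_3 + \omega_4)\, t_{13}
 - (\omega_3 + \beta_{32}^2\omega_2 + \beta_{32}^2\beta_{24}^2\omega_4)\beta_{43}\, t_{14}\bigr].
$$

The linear forms in the factorization correspond to the vectors in (3.4) and
(3.5). □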
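The vanishing of the polynomial on the model can be spot-checked in exact
arithmetic. The Python sketch below assumes that the covariance matrix of the
feedback model is $(I - B)^{-1}\mathrm{diag}(\omega)(I - B)^{-T}$, with $B$
holding the coefficients $\beta_{ij}$ (an assumed structural-equation form, not
quoted from Example 3.6), and evaluates the 19-term polynomial at one generic
and one singular parameter point. All function names are hypothetical.

```python
from fractions import Fraction as F


def inverse(m):
    """Exact Gauss-Jordan inverse of a square matrix with Fraction entries."""
    n = len(m)
    aug = [list(row) + [F(int(r == c)) for c in range(n)] for r, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def covariance(beta, omega):
    """Sigma = (I - B)^{-1} diag(omega) (I - B)^{-T} (assumed model form)."""
    a = [[F(int(i == j)) - beta.get((i + 1, j + 1), F(0)) for j in range(4)]
         for i in range(4)]
    ainv = inverse(a)
    return [[sum(ainv[r][k] * omega[k] * ainv[c][k] for k in range(4))
             for c in range(4)] for r in range(4)]


def f_poly(s):
    """Evaluate the 19-term polynomial at a symmetric 4 x 4 matrix s."""
    s11, s12, s13, s14 = s[0]
    s22, s23, s24 = s[1][1:]
    s33, s34 = s[2][2:]
    s44 = s[3][3]
    return (s13*s14**3*s23**2 - 2*s13**2*s14**2*s23*s24 + s13**3*s14*s24**2
            - s12*s14**3*s23*s33 + s12*s13*s14**2*s24*s33
            + s11*s14**2*s23*s24*s33 - s11*s13*s14*s24**2*s33
            + s12**2*s14**2*s33*s34 - s11*s14**2*s22*s33*s34
            - s12**2*s13*s14*s34**2 + s11*s13*s14*s22*s34**2
            + s12*s13**2*s14*s23*s44 - s11*s13*s14*s23**2*s44
            - s12*s13**3*s24*s44 + s11*s13**2*s23*s24*s44
            - s12**2*s13*s14*s33*s44 + s11*s13*s14*s22*s33*s44
            + s12**2*s13**2*s34*s44 - s11*s13**2*s22*s34*s44)


# Arbitrary rational test points; the second lies on beta_31 + beta_32*beta_21 = 0.
generic = {(2, 1): F(1), (3, 1): F(1), (3, 2): F(2), (4, 3): F(1), (2, 4): F(1)}
singular = dict(generic)
singular[3, 1] = -singular[3, 2] * singular[2, 1]
omega = [F(1), F(2), F(3), F(4)]
print(f_poly(covariance(generic, omega)), covariance(singular, omega)[0][2:])
```

A check at two points does not replace the implicitization computation, but it
guards against transcription errors in the long polynomial and illustrates that
$\sigma_{13}$ and $\sigma_{14}$ vanish on the singular locus.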

The next proposition summarizes what we know about the globally identifiable
cases from Lemma A.1. Note that case (ii) is an example where the
parametrization is globally identifiable with full rank Jacobian, but where a
nonstandard limiting distribution arises for the likelihood ratio statistic.

Proposition A.4. Let $\lambda_n$ be the likelihood ratio statistic for testing
(1.1). Let $\Sigma_0 = f(\beta,\omega)$ be the true covariance matrix. Suppose
$n \to \infty$.

(i) If $\beta_{31} + \beta_{32}\beta_{21} \neq 0$, then
$\lambda_n \xrightarrow{d} \chi^2_1$.

(ii) If $\beta_{31} + \beta_{32}\beta_{21} = 0$ and at least one of the
parameters $\beta_{32}$, $\beta_{43}$, $\beta_{24}$ is zero, then $\lambda_n$
converges to the distribution of a minimum of two $\chi^2_1$-random variables,
which as in Example 3.6 is determined by the cosine $\rho$ in (3.6).

(iii) If $\beta_{31} + \beta_{32}\beta_{21} = 0$ and
$\beta_{32}\beta_{43}\beta_{24} = -1$, then the asymptotic p-values for the
likelihood ratio test can be bounded as
$P(\chi^2_1 > t) \leq p_\infty(t) \leq P(\chi^2_2 > t)$.

Proof. (i) This is the smooth case, which follows from (3.2), Lemmas A.3
and 4.4, and Theorem 4.3.

(ii) Under the assumed conditions on $\beta$, the algebraic tangent cone is the
union of two distinct hyperplanes because $\eta \neq \bar\eta$. The hyperplane
given by $\eta$ corresponds to the span of the columns of the Jacobian of $f$ at
$(\beta,\omega)$. The other hyperplane comprises tangent vectors obtained from
diverging sequences $(\bar\beta,\bar\omega)$ such that
$f(\bar\beta,\bar\omega) \to \Sigma_0$. This can be done by plugging the
expressions in (A.2) and (A.3) into the Jacobian of $f$ (note that
$\bar\omega_i = 1/\bar\kappa_i$) and computing the column span. Hence,
$T_{\Theta_0}(\Sigma_0) = A_{\Theta_0}(\Sigma_0)$ is of the same form as for the
locally identifiable case discussed in Example 3.6. (If
$\beta_{43} = \beta_{24} = 0$ or $\beta_{43} = \beta_{32} = 0$, then $\rho = 0$
implies independence of the two $\chi^2_1$-random variables of which the
minimum is taken.)

(iii) In this case, the normals $\eta$ and $\bar\eta$ are proportional to each
other and the two associated hyperplanes coincide. Hence,
$A_{\Theta_0}(\Sigma_0)$ is a hyperplane and $P(\chi^2_1 > t) \leq p_\infty(t)$.
Since the rank of the Jacobian of $f$ at $(\beta,\omega)$ is equal to 8, the
upper bound on $p_\infty(t)$ follows from Lemma 4.5. □
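For a given observed value $t$ of the likelihood ratio statistic, the bracket
in case (iii) is easy to evaluate. The Python sketch below assumes that the two
bounds are the tail probabilities of $\chi^2_1$ and $\chi^2_2$ (the upper bound
with two degrees of freedom is the reading adopted here for a Jacobian of rank
8 in a 10-dimensional space of symmetric matrices) and uses their closed
forms; the function name is a hypothetical choice.

```python
import math


def pvalue_bounds(t):
    """Return (lower, upper) bounds on the asymptotic p-value at t >= 0.

    lower = P(chi^2_1 > t) = erfc(sqrt(t / 2))
    upper = P(chi^2_2 > t) = exp(-t / 2)   (assumed degrees of freedom: 2)
    """
    lower = math.erfc(math.sqrt(t / 2.0))
    upper = math.exp(-t / 2.0)
    return lower, upper


if __name__ == "__main__":
    for t in (1.0, 3.84, 5.99, 10.0):
        lo, hi = pvalue_bounds(t)
        print(f"t = {t:5.2f}: {lo:.4f} <= p <= {hi:.4f}")
```

Rejecting only when the upper bound falls below the nominal level gives a
conservative test that does not require knowledge of the tangent cone itself.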

The tangent cone $T_{\Theta_0}(\Sigma_0)$ in case (iii) of Proposition A.4 seems
difficult to obtain. However, based on examination of the degree 3-terms in
the equation that defines $\Theta_0 - \Sigma_0$ at the origin, we believe that,
similarly to Example 1.2,
$T_{\Theta_0}(\Sigma_0) \subsetneq A_{\Theta_0}(\Sigma_0)$.

APPENDIX B: THE SIMPLEST TWO-FACTOR MODEL 

In this appendix we discuss the geometry of $F_{5,2}$, the covariance matrix
parameter space of factor analysis with 5 observed variables and 2 factors.
The Zariski closure of $F_{5,2}$ is the hypersurface defined by the vanishing of
the pentad

$$
\begin{aligned}
& t_{12}t_{13}t_{24}t_{35}t_{45} - t_{12}t_{13}t_{25}t_{34}t_{45} - t_{12}t_{14}t_{23}t_{35}t_{45} + t_{12}t_{14}t_{25}t_{34}t_{35} \\
&+ t_{12}t_{15}t_{23}t_{34}t_{45} - t_{12}t_{15}t_{24}t_{34}t_{35} + t_{13}t_{14}t_{23}t_{25}t_{45} - t_{13}t_{14}t_{24}t_{25}t_{35} \\
&- t_{13}t_{15}t_{23}t_{24}t_{45} + t_{13}t_{15}t_{24}t_{25}t_{34} - t_{14}t_{15}t_{23}t_{25}t_{34} + t_{14}t_{15}t_{23}t_{24}t_{35}.
\end{aligned}
$$

Finding the tangent cones of $F_{5,2}$ is an open problem, but we can compute
the algebraic tangent cones of this hypersurface.
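As a quick sanity check of the pentad as written above, the following Python
sketch evaluates it at a covariance matrix of the form
$\Lambda\Lambda^T + \mathrm{diag}(\psi)$ with a $5 \times 2$ loading matrix
$\Lambda$. The symbols $\Lambda$ and $\psi$, the helper names, and the integer
test values are illustrative assumptions; only the twelve signed monomials are
taken from the display.

```python
# Signed monomials of the pentad; each string lists five index pairs "gh" for t_gh.
PENTAD = [
    (+1, "12 13 24 35 45"), (-1, "12 13 25 34 45"), (-1, "12 14 23 35 45"),
    (+1, "12 14 25 34 35"), (+1, "12 15 23 34 45"), (-1, "12 15 24 34 35"),
    (+1, "13 14 23 25 45"), (-1, "13 14 24 25 35"), (-1, "13 15 23 24 45"),
    (+1, "13 15 24 25 34"), (-1, "14 15 23 25 34"), (+1, "14 15 23 24 35"),
]


def pentad(s):
    """Evaluate the pentad at a 5 x 5 matrix s (only entries above the diagonal are used)."""
    total = 0
    for sign, pairs in PENTAD:
        term = sign
        for p in pairs.split():
            term *= s[int(p[0]) - 1][int(p[1]) - 1]
        total += term
    return total


def two_factor_cov(loadings, psi):
    """Sigma = Lambda Lambda^T + diag(psi) for a list of loading rows."""
    m = len(loadings)
    return [[sum(x * y for x, y in zip(loadings[r], loadings[c]))
             + (psi[r] if r == c else 0) for c in range(m)] for r in range(m)]


loadings = [[1, 0], [1, 1], [1, 2], [1, 3], [1, 4]]   # arbitrary integer loadings
sigma = two_factor_cov(loadings, psi=[1, 2, 3, 4, 5])
print(pentad(sigma))
```

Integer inputs keep the evaluation exact. Perturbing a single off-diagonal
entry of `sigma` generally moves the matrix off the hypersurface, which the
same function detects.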

The singularities of $F_{5,2}$ are of two types ([10], Example 33). First, there
are the symmetric matrices with a row (and column) that is off-diagonally
zero. Consider the matrices
with first row and column off-diagonally zero as a representative set. For
almost all singularities $\Sigma = (\sigma_{ij})$ of form (B.1), the algebraic
tangent cone $A_{F_{5,2}}(\Sigma)$ is the irreducible real algebraic variety given
by the quadratic polynomial

$\sigma_{45}(\sigma_{24}\sigma_{35} - \sigma_{34}\sigma_{25})\, t_{12}t_{13} - \sigma_{35}(\sigma_{23}\sigma_{45} - \sigma_{34}\sigma_{25})\, t_{12}t_{14} \pm \cdots$

obtained by replacing the indeterminates $t_{gh}$ with $2 \leq g < h \leq 5$ in the
pentad by $\sigma_{gh}$. However, at special matrices of form (B.1) the algebraic
tangent cone may be of degree 3 or larger. This occurs if the submatrix
$\Sigma_{\{2,\dots,5\}\times\{2,\dots,5\}}$ satisfies all its tetrads or has an
off-diagonal $2 \times 2$-submatrix that is zero. Degree 4 occurs if precisely
one entry above the diagonal of $\Sigma$ is nonzero. If $\Sigma$ is diagonal,
then $A_{F_{5,2}}(\Sigma)$ is the pentad hypersurface itself.

The second type of singularities of $F_{5,2}$ is given by symmetric matrices
that satisfy all those tetrads that do not involve some given off-diagonal
entry. As a representative set, consider the matrices that satisfy all those
tetrads that do not involve $\sigma_{12}$. We note that these matrices can be
parametrized as

For almost all singularities $\Sigma$ of form (B.2), the algebraic tangent
cone $A_{F_{5,2}}(\Sigma)$ is the irreducible real algebraic variety defined by a
quadratic polynomial with 18 terms

$\sigma_{15}\sigma_{45}t_{23}t_{34} + \sigma_{34}\sigma_{45}t_{23}t_{15} + \sigma_{23}\sigma_{34}t_{45}t_{15} - \sigma_{25}\sigma_{45}t_{34}t_{13} - \sigma_{15}\sigma_{35}t_{34}t_{24} \pm \cdots$

that are obtained from the six pentad monomials of the form
$t_{12}t_{2i_3}t_{i_3i_4}t_{i_4i_5}t_{1i_5}$ by dropping $t_{12}$, and replacing
$t_{2i_3}t_{i_3i_4}$, $t_{i_3i_4}t_{i_4i_5}$ or $t_{i_4i_5}t_{1i_5}$ by the
corresponding expression in $\sigma_{gh}$. The degree of $A_{F_{5,2}}(\Sigma)$ is
larger than two if (i) $\gamma_{12} = 0$ or $\gamma_{22} = 0$ such that
$\Sigma \in F_{5,1}$, or (ii) two or more of the coefficients $\gamma_{31}$,
$\gamma_{41}$ and $\gamma_{51}$ are zero which leads to at least two
off-diagonally zero rows (and columns) in $\Sigma$, or (iii) $\gamma_{11}$,
$\gamma_{21}$ and at least one coefficient among $\gamma_{31}$, $\gamma_{41}$
and $\gamma_{51}$ are zero. In case (iii) the matrix becomes block-diagonal;
for example, if $\gamma_{11} = \gamma_{21} = \gamma_{31} = 0$ then
$\Sigma = \mathrm{diag}(\Sigma_{12\times 12}, \sigma_{33}, \Sigma_{45\times 45})$.
As for the singularities of the first type, the algebraic tangent cone admits
degree 4 if $\Sigma$ has precisely one nonzero off-diagonal entry and degree 5
if $\Sigma$ is diagonal.

For a generic one-factor matrix $\Sigma \in F_{5,1}$, the cone
$A_{F_{5,2}}(\Sigma)$ is given by a cubic polynomial with 60 terms

$$
\begin{aligned}
& \sigma_{15}\sigma_{45}t_{12}t_{23}t_{34} + \sigma_{12}\sigma_{15}t_{23}t_{34}t_{45} + \sigma_{34}\sigma_{45}t_{12}t_{23}t_{15} \\
&+ \sigma_{23}\sigma_{34}t_{12}t_{45}t_{15} + \sigma_{12}\sigma_{23}t_{34}t_{45}t_{15} - \sigma_{25}\sigma_{45}t_{12}t_{34}t_{13} \\
&- \sigma_{12}\sigma_{25}t_{34}t_{45}t_{13} - \sigma_{24}\sigma_{45}t_{23}t_{15}t_{13} \pm \cdots.
\end{aligned}
$$

The terms are obtained by choosing one of the twelve monomials
$t_{i_1i_2}t_{i_2i_3}t_{i_3i_4}t_{i_4i_5}t_{i_5i_1}$ in the pentad and one of the
five distinct indices $i_j$, and replacing $t_{i_ji_{j+1}}t_{i_{j+1}i_{j+2}}$ by
$\sigma_{i_ji_{j+1}}\sigma_{i_{j+1}i_{j+2}}$. Here the additions $j + 1$ and
$j + 2$ are modulo 5.
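The term counts quoted for the quadratic and cubic polynomials follow from a
small enumeration. The Python sketch below identifies each pentad monomial with
a cycle through the indices 1 to 5 (as suggested by the form
$t_{i_1i_2}t_{i_2i_3}t_{i_3i_4}t_{i_4i_5}t_{i_5i_1}$) and lists, without signs,
the ways of replacing two consecutive factors by the corresponding
$\sigma$-entries. The function names and the set-based representation are
illustrative choices.

```python
from itertools import permutations


def five_cycles():
    """The undirected cycles through 1, ..., 5, each listed once as a vertex tuple."""
    return [(1,) + p for p in permutations([2, 3, 4, 5]) if p[0] < p[-1]]


def edge(a, b):
    return (min(a, b), max(a, b))


def cycle_edges(cyc):
    """Edges of a cycle in cyclic order, i.e. the index pairs of one monomial."""
    return [edge(cyc[k], cyc[(k + 1) % 5]) for k in range(5)]


def cubic_terms():
    """Terms (sigma pair, t triple): any monomial, any two consecutive factors."""
    terms = set()
    for cyc in five_cycles():
        edges = cycle_edges(cyc)
        for j in range(5):
            sig = frozenset((edges[j], edges[(j + 1) % 5]))
            terms.add((sig, frozenset(e for e in edges if e not in sig)))
    return terms


def quadratic_terms():
    """Terms (sigma pair, t pair): monomials containing t_12, with t_12 dropped."""
    terms = set()
    for cyc in five_cycles():
        edges = cycle_edges(cyc)
        if (1, 2) not in edges:
            continue
        k0 = edges.index((1, 2))
        path = edges[k0 + 1:] + edges[:k0]   # remaining four factors, in order
        for j in range(3):
            sig = frozenset(path[j:j + 2])
            terms.add((sig, frozenset(e for e in path if e not in sig)))
    return terms


print(len(five_cycles()), len(cubic_terms()), len(quadratic_terms()))
```

The enumeration ignores the signs of the monomials, so it confirms only the
number of terms and their index patterns, not the full polynomials.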
The above algebraic tangent cones depend on the numerical values of the
entries of a singularity. Since the tangent cones themselves might even be 
more diverse, as was the case with one-factor analysis, we can expect the 
likelihood ratio statistic for testing the two-factor model to admit many 
different limiting distributions. 